WO 98/23781 PCT/US97/21861 

LIGAND DETECTION SYSTEM AND METHODS OF USE THEREOF 

CROSS-REFERENCE TO RELATED APPLICATIONS 

The present application is a continuation-in-part of U.S. provisional 
application serial number 60/031,793, filed November 26, 1996, and U.S. provisional 
application serial number 60/043,560, filed April 15, 1997, both of which provisional 
applications are fully incorporated herein by reference. 
BACKGROUND OF THE INVENTION 
1. Field of the Invention 

The present invention relates to a novel ligand detection system and methods 
of using the system to identify ligands capable of specifically binding orphan protein 
domains. The invention particularly relates to peptide ligands capable of specifically 
binding an orphan domain such as the PDZ domain of neuronal nitric oxide synthase 
(nNOS). Further provided are methods of detecting the peptide ligands and orphan 
protein domains capable of specifically binding the peptide ligands. The present 
invention is useful for a variety of applications including detecting peptide ligands 
with therapeutic capacity to treat human diseases. 

Thirteen billion distinct peptides were screened to determine that the nNOS-PDZ 
domain binds with nanomolar affinity to peptides ending in Asp-X-Val. Preference 
for Asp at the -2 peptide position is mediated by Tyr-77 of nNOS, and mutating this 
residue to His changes the binding specificity from Asp-X-Val to Thr-X-Val. Guided 
by the Asp-X-Val consensus, candidate nNOS interacting proteins have been 
identified, including glutamate and melatonin receptors. The peptides comprising the 
consensus sequence Asp-X-Val are useful in altering the interaction of the nNOS PDZ 
domain with its cognate interacting proteins to prevent the overproduction of NO. 
Altering the interaction between these proteins with the peptides of the invention can 
be used to treat many neurodegenerative diseases, including stroke, ALS, Alzheimer's 
disease, Parkinson's disease and Huntington's disease. The peptides will also be 
useful for the treatment of muscular dystrophies such as Duchenne muscular 
dystrophy and motility disorders such as irritable bowel syndrome. 

The present invention also relates to a method of identifying the amino acid 
sequence of a peptide or protein that interacts with a protein domain of interest 
(orphan protein domain). The disclosed Protein Interaction Network (PIN) uses an in 
vitro selection strategy that identifies the amino acid sequences that interact with a 
given orphan protein domain. This sequence information is then used to search 
nucleic acid and protein sequence libraries. Interacting PINs from different orphan 
protein domains are assembled into an electronic resource that can be searched with 
the sequence of a protein domain of interest. 
2. Background 

All publications and patent applications herein are incorporated by reference 
to the same extent as if each individual publication or patent application was 
specifically and individually indicated to be incorporated by reference. 

A fundamental area of inquiry in biology is the analysis of interactions 
between proteins. Proteins are complex macromolecules made up of covalently 
linked chains of amino acids. Each protein assumes a unique three dimensional shape 
determined principally by its sequence of amino acids. Many proteins consist of 
smaller units termed domains, which are continuous stretches of amino acids able to 
fold independently from the rest of the protein. Some of the important functions of 
proteins are as enzymes, polypeptide hormones, nutrient transporters, structural 
components of the cell, hemoglobins, antibodies, nucleoproteins, and components of 
viruses. 

Protein-protein interactions enable two or more proteins to associate. A large 
number of non-covalent bonds form between the proteins when two protein surfaces 
are precisely matched, and these bonds account for the specificity of recognition. 
Protein-protein interactions are involved, for example, in the assembly of enzyme 
subunits; in antigen-antibody reactions; in forming the supramolecular structures of 
ribosomes, filaments, and viruses; in transport; and in the interaction of receptors on a 
cell with growth factors and hormones. Products of oncogenes can give rise to 
neoplastic transformation through protein-protein interactions. For example, some 
oncogenes encode protein kinases whose enzymatic activity on cellular target proteins 
leads to the cancerous state. Another example of a protein-protein interaction occurs 
when a virus infects a cell by recognizing a polypeptide receptor on the surface, and 
this interaction has been used to design antiviral agents. 

Evidence has accumulated over the past years that protein-protein interactions 
are often mediated by protein modules or domains such as src homology domain 2 
(SH2) and src homology domain 3 (SH3). Recently a novel modular domain has been 
identified in a diverse set of proteins that are typically associated with cell junctions, 
including synapses of the central nervous system. These novel modular domains are 
known as PDZ domains. PDZ domains have also been called "GLGF repeats" and 
"discs-large homology repeats" (DHRs) and consist of about 80 amino acids. These 
domains were first identified as repeated sequences in the neuron-specific 
postsynaptic density protein (PSD-95/SAP-90), the Drosophila septate junction 
protein discs-large (dlg), and the epithelial tight-junction protein zona occludens-1 
(ZO-1) (K. Cho et al., Neuron, 9:929-942 (1992); S. Gomperts, Cell, 84:659-662 
(1996)). PDZ domains occur in structural proteins of the cytoskeleton and in a 
heterogeneous family of enzymes that associate with the cytoskeleton, suggesting a 
role for PDZ domains in protein-protein interactions (C. Ponting et al., Trends in 
Biological Sciences, 20:102-103 (1995)). Supporting this notion, the three PDZ 
domains within PSD-95 were first shown to bind the carboxy-terminal Ser/Thr-X-Val 
motif found in certain N-methyl-D-aspartate (NMDA) type glutamate receptors and in 
Shaker type potassium channel subunits (E. Kim et al., Nature, 378:85-88 (1995); H. 
Kornau et al., Science, 269:1737-1740 (1995)). Clustering and localizing channels at 
synaptic sites is one function of the concatenated domains (M. Sheng, Neuron, 
17:575-578 (1996)). 

The crystal structures of the third PDZ domains of PSD-95 and dlg have been 
determined (D. Doyle et al., Cell, 85:1067-1076 (1996); J. Cabral et al., Nature, 
382:649-652 (1996)). The PDZ structures show a "carboxylate binding loop", 
containing the signature GLGF sequence, which interacts with the C-terminal 
carboxylate group of the peptide ligand. The peptide ligand forms main chain 
interactions with backbone amide groups in a conserved helix and β strand of the PDZ 
domain. A critical sequence-specific interaction has been noted between the 
threonine at the -2 position of the bound peptide and a histidine residue in the PDZ 
domain (D. Doyle et al., Cell, 85:1067-1076 (1996)). This histidine is conserved in 
all PDZ repeats of dlg, PSD-95 and related proteins. This histidine, however, is not 
conserved in other PDZ domains (C. Ponting et al., Trends in Biological Sciences, 
20:102-103 (1995)), suggesting distinct peptide-binding specificities. 

Since PDZ domains mediate specific protein-protein interactions, determining the 
physiological ligand(s) for orphan PDZ domains is critical to understanding the 
biological function of PDZ containing proteins. Recent evidence shows 
that interaction between the PDZ domain and peptide ligands can be regulated by 
differential affinity (B. Muller et al., Neuron, 17:255-265 (1996)) and by protein 
phosphorylation (N. Cohen, Neuron, 17:759-767 (1996)). These mechanisms, 
however, are not adequate to explain the diversity of PDZ-target protein interactions 
in both excitable and non-excitable tissues. 

Nitric oxide (NO), an endogenous signaling molecule, plays critical roles in 
nervous, immune, and cardiovascular function (D. Bredt et al., Ann. Rev. Biochem., 
63:175-195 (1994); M. Marletta, J. Biol. Chem., 268:12231-12234 (1993); S. 
Moncada et al., N. Engl. J. Med., 329:2002-2012 (1993)). Physiological studies have 
demonstrated numerous functions for neuron-derived NO, produced primarily by the 
neuronal NO synthase (nNOS). However, excess nNOS activity mediates brain injury 
in cerebral ischemia and in animal models of Parkinson's disease (T. Dawson et al., 
Ann. Neurol., 32:297-311 (1992); P. Hantraye et al., Nature Medicine, 2:1017-1021 
(1996); Z. Huang et al., Science, 265:1883-1885 (1994)). Excess nNOS activity has 
been broadly linked with many neurodegenerative diseases, motility disorders and 
muscular dystrophies, including Alzheimer's disease and Huntington's disease (see 
generally D. Bredt et al., Nature, 351:714-718 (1991)). nNOS activity must therefore 
be tightly regulated. One level of regulation is reflected by molecular targeting of 
nNOS to specific intracellular membrane domains (C. Aoki et al., Brain Res., 
620:97-113 (1993)). This subcellular localization is mediated by the N-terminus of nNOS, 
which contains a PDZ domain (J. Brenman et al., Cell, 82:743-752 (1995)). This 
N-terminal domain of nNOS interacts with the PDZ domain of α1-syntrophin and the 
second PDZ domains of PSD-95 and PSD-93. These interactions target nNOS to 
synaptic sites in skeletal muscle and brain (J. Brenman et al., Cell, 84:757-767 
(1996)). The structural details of these PDZ-PDZ interactions are not yet known. 

Several lines of evidence suggest that additional binding partners for the PDZ 
domain of nNOS may also exist. First, not all membrane-associated nNOS in brain is 
bound to PSD-95 and related proteins (J. Brenman et al., Journal of Neuroscience, 
(1996) (in press) unpublished observations). Also, in certain muscle diseases, nNOS 
does not interact properly with α1-syntrophin at the skeletal muscle sarcolemma (D. 
Chao et al., Journal of Experimental Medicine, 184:609-618 (1996)). We therefore 
sought to determine whether specific carboxylate-peptides might also associate with 
the PDZ domain of nNOS. Identification of such peptides would facilitate the 
study of the structure and function of PDZ domains. Also, the in vitro defined peptide 
sequences may help identify additional nNOS interacting proteins. 

Protein-protein interactions have been generally studied in the past using 
biochemical techniques such as cross-linking, co-immunoprecipitation and 
co-fractionation by chromatography. One of the disadvantages of these techniques is that 
interacting proteins often exist in very low abundance and are, therefore, difficult to 
detect. Another major disadvantage is that these biochemical techniques involve only 
the proteins, not the genes encoding them. When an interaction is detected using 
biochemical methods, the newly identified protein often must be painstakingly 
isolated and then sequenced to enable the gene encoding it to be obtained. Another 
disadvantage is that these methods do not immediately provide information about 
which domains of the interacting proteins are involved in the interaction. 

In vitro determination of ligands for peptide-binding domains, such as SH3 
and SH2 motifs, has been achieved using two types of random peptide libraries (A. 
Sparks et al., Methods Enzymol., 255:498-509 (1995); M. Sheng, Neuron, 17:575-578 
(1996); S. Zhou et al., Methods Enzymol., 254:523-535 (1995); and review by M. 
Gallop et al., Journal of Medicinal Chemistry, 37:1233-1251 (1994)). One strategy 
utilizes the filamentous phage coat protein to display random N-terminal peptides. By 
repeated rounds of affinity panning and amplification, individual interacting peptides 
can be identified by sequencing the corresponding coding region of phage DNA (A. 
Sparks et al., Methods Enzymol., 255:498-509 (1995)). A second approach uses 
soluble random peptides that are chemically synthesized. By affinity purification of a 
mixture of bound peptides and subsequent peptide sequencing, a population based 
consensus can be deduced (S. Zhou et al., Methods Enzymol., 254:523-535 (1995)). 
Because the phage display system only accommodates N-terminal peptides, it cannot 
be used to select C-terminal peptides for the PDZ domain. Although chemical peptide 
libraries are applicable, the approach has difficulties in accommodating cysteine and 
tryptophan and does not provide individual ligand sequences. As a result, analyses of 
chemical libraries cannot resolve compensatory effects potentially present in peptides 
of low abundance and may miss high affinity sequences containing tryptophan and 
cysteine. Thus, it would be desirable to use a genetic strategy to screen a large pool 
of C-terminal peptides containing all 20 amino acids to identify individual PDZ 
binding peptides. 

SUMMARY OF THE INVENTION 

In one aspect, the invention relates to peptides capable of altering the 
interaction between the nNOS PDZ domain and the proteins with which this domain 
interacts. The peptides preferably alter the interactions between the nNOS PDZ 
domain and melatonin or non-NMDA type glutamate receptors. The peptides of the 
invention are useful in the formulation of therapeutic compositions which alter 
intermolecular binding between the nNOS PDZ domain and the proteins with which this 
domain interacts in vivo. Via inhibition of these interactions, the peptides of the 
invention will be useful in suppressing the production of excess levels of NO which 
are neurotoxic and contribute to myofiber necrosis. For example, the peptides of the 
invention can be used to treat many neurodegenerative diseases, including stroke, 
ALS, Alzheimer's disease, Parkinson's disease and Huntington's disease. The 
peptides are also useful for the treatment of muscular dystrophies such as Duchenne 
muscular dystrophy and motility disorders such as irritable bowel syndrome. 

Another aspect of the invention is to provide peptides comprising the general 
sequence D-X-V-COOH wherein D=Aspartic acid, X is variable and V=Valine. 

Another object of the invention is to provide peptides that are capable of altering the 
interaction between the nNOS PDZ domain and the proteins with which this domain 
interacts and that are useful as commercial laboratory or bioprocess reagents. 

Another object of the invention is to provide peptides which can be used as 
molecular probes that specifically label nNOS. For instance, the peptides of the 
invention can be labeled according to standard procedures in the art and can be used 
as molecular probes to detect nNOS in vivo or in vitro. 

The invention also provides a kit comprising peptides which interact with the 
PDZ domain of nNOS. 

Another aspect of the invention is isolated nucleic acid sequences that encode 
the peptides described herein. 

Another object of the invention is to couple a genetic system that identifies 
peptides which interact with a given protein domain (orphan protein domain) with the 
available electronic sequence databases. The genetic system provides the sequence of 
the peptide which interacts with the orphan protein domain. This sequence is then 
used to identify proteins already present in the electronic nucleic acid and protein 
sequence databases. A Protein Interaction Network (PIN) is then assembled which 
correlates the peptide sequences which interact with a given orphan protein domain. 
Assembly of many different PINs results in the assembly of a Super Protein 
Interaction Network (SPIN) which will serve as an electronic extension for existing 
sequence databases. This allows the researcher to search the database with the 
sequence of a given orphan protein domain for peptide sequences which are known to 
specifically interact with that orphan protein domain. 

The invention also relates to a peptide ligand detection system that includes a 
random peptide library preferably of at least about 10^6 members comprising a 
recombinant DNA vector encoding a DNA binding protein. The DNA binding 
protein is selected to specifically bind a DNA sequence on the vector. The DNA 
binding protein encoded by the DNA vector comprises a random peptide sequence 
covalently linked to the DNA binding protein as an in-frame fusion protein. The 
fusion protein is typically formatted so that the DNA vector can encode preferably at 
least about 10^6 different fusion proteins up to about 10^8 fusion proteins or more, each 
of which is capable of specifically binding the DNA sequence on the vector. The 
peptide ligand detection system further includes an orphan protein domain sequence 
immobilized on a solid support that is capable of specifically binding the random 
peptide of the DNA binding protein. 

Significantly, the ligand detection system of the present invention utilizes an 
immobilized orphan protein domain sequence to specifically bind the random peptide 
of the in-frame fusion protein. Typically, the orphan protein domain sequence is a 
contiguous or non-contiguous amino acid sequence within the linear sequence of a 
protein of interest. Sometimes the orphan protein domain sequence is referred to as a 
protein module. In contrast, prior ligand detection systems using random peptide 
libraries rely on substantially larger molecules to bind the ligand, e.g., receptors, 
antibodies, or enzymes. Exemplary orphan protein domain sequences are illustrated 
below in Figure 7. 

The peptide ligand detection system can further include an inducer molecule 
capable of specifically binding the DNA binding protein. Typically, the inducer 
molecule is selected to release the recombinant DNA vector from the immobilized 
orphan protein domain sequence. In particular, the inducer molecule can be 
isopropylthio-β-D-galactoside (IPTG). 

A peptide ligand detection system in accord with the present invention can 
include one of a variety of suitable recombinant DNA vectors. That is, the 
recombinant DNA vectors can encode a variety of suitable DNA binding proteins and 
DNA sequences capable of being bound by the DNA binding proteins. 

For example, the DNA binding protein of the peptide ligand detection system 
can include a prokaryotic repressor protein sequence. In addition, the DNA sequence 
bound by the DNA binding protein can be a prokaryotic operator sequence. More 
specifically, the prokaryotic repressor protein sequence can be a lac repressor or a 
fragment thereof capable of specifically binding the DNA sequence on the 
recombinant DNA vector. In addition, the prokaryotic operator sequence can be lac O 
or a fragment thereof capable of being specifically bound by the prokaryotic repressor 
protein sequence. 

As noted, the recombinant DNA vectors of the random peptide library are 
formatted to express the random peptide as a fusion protein. A DNA binding protein 
of the invention typically features high avidity binding to DNA and has a region 
preferably at the C-terminus of the protein that can accept an amino acid sequence 
insertion without interfering with the DNA binding activity of the protein. The 
half-life of a specific binding pair formed between the DNA binding protein and the 
recombinant DNA vector must be long enough for screening to occur. In general, that 
half-life will be at least about one to four hours or longer. The half-life of the specific 
binding pair formed between the random peptide and the immobilized orphan protein 
domain will also be about one to four hours or longer. 
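
To make the half-life requirement concrete, the following minimal sketch converts a half-life into a dissociation rate constant and estimates how much of a complex survives a screening step. It assumes simple first-order dissociation with no rebinding, which the specification does not state; the function names are illustrative only.

```python
import math


def off_rate_from_half_life(half_life_hours: float) -> float:
    """Return the first-order dissociation rate constant in s^-1.

    Assumes simple exponential decay of the complex (an assumption made
    for illustration, not a statement from the specification).
    """
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / (half_life_hours * 3600.0)


def fraction_bound(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of complexes still intact after elapsed_hours."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


if __name__ == "__main__":
    # Compare the two ends of the "one to four hours" range for a
    # hypothetical two-hour panning step.
    for t_half in (1.0, 4.0):
        print(t_half, off_rate_from_half_life(t_half), fraction_bound(2.0, t_half))
```

Under this assumption, a complex with a one-hour half-life retains a quarter of its population after two hours, which illustrates why longer half-lives are preferred for multi-step screening.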

If desired, the peptide ligand detection system can include an in-frame peptide 
linker sequence, e.g., between the prokaryotic repressor protein sequence (or 
fragment) and the random peptide sequence. 

A peptide ligand detected by the present ligand detection system is capable of 
specifically binding the immobilized orphan protein domain of interest. The binding 
affinity (EC50) of the specific binding interaction depends on several parameters such 
as the degree of binding affinity desired and the complexity of the random peptide 
sequence. However, in general the binding affinity will be in the micromolar to 
nanomolar range for most immobilized orphan protein domains. 
As will be discussed more fully below, an exemplary peptide ligand in accord 
with the present invention comprises about 3, 6, 7, 8, 9, 10, 12, 15, 20, 25, 
30, 35, 40, 50 or more amino acids. For example, the present invention provides a 
peptide ligand comprising about 3, 6, 7, 8, 9 or 10 amino acids in which the 
C-terminal sequence of the peptide ligand consists of the sequence D-X-V-COOH, 
wherein D is Asp, X is any amino acid, preferably any of the 20 common natural 
amino acids, and V is Val. That peptide ligand has been found to specifically bind a 
specified orphan protein domain (PDZ). 

In general, a peptide ligand in accord with the invention has a binding affinity 
(EC50) for an orphan protein domain preferably in the micromolar to nanomolar 
range. Preferred peptide ligands have an EC50 in the nanomolar range. 

In particular, the immobilized orphan protein domain can be a PDZ domain 
such as those obtained from a variety of known proteins such as nitric oxide synthase 
(nNOS), post-synaptic density protein (PSD-95/SAP-90), post-synaptic density 
protein (PSD-93), epithelial tight-junction protein zona occludens-1 (ZO-1), 
N-methyl-D-aspartate (NMDA) type glutamate receptor, Shaker-type potassium channel 
subunit, and α1-syntrophin. 

The invention further provides therapeutic compositions comprising a peptide 
ligand of the present invention. The therapeutic compositions are preferably provided 
in a pharmaceutically acceptable vehicle, e.g., sterile and pyrogen-free. Examples of 
preferred therapeutic compositions are specified below. 

Further provided are isolated nucleic acids encoding peptide ligands of the 
present invention and particularly DNA vectors comprising the isolated nucleic acids. 

The present invention also provides a method of detecting a peptide ligand 
capable of specifically binding an orphan protein domain of interest. In general, the 
method includes lysing transformed cells comprising the random peptide library 
generally discussed above. The lysing is under conditions such that the DNA binding 
protein comprising the random peptide remains bound to the recombinant DNA 
vector. The method further includes the steps of contacting the fusion proteins of the 
random peptide library to an immobilized orphan protein domain under conditions 
conducive to specific peptide-orphan protein domain binding and isolating a 
recombinant DNA vector encoding a fusion protein that specifically binds to the 
orphan protein domain. 
In most cases, the method will further include the steps of transforming a host 
cell with the isolated recombinant DNA vector obtained, repeating the lysing and 
contacting steps, and isolating a selected recombinant DNA vector. As will be shown 
below in the examples, practice of this method leads to amplification of the selected 
recombinant DNA vector. 

The method will also typically include the steps of determining the amino 
acid sequence of the random peptide encoded by the selected recombinant DNA 
vector, and searching a protein sequence database to identify an orphan protein 
domain in the database comprising the random peptide. 
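
The database-search step can be sketched for the D-X-V-COOH consensus described in this specification. The snippet below is only an illustration under stated assumptions: the record names and sequences are hypothetical placeholders, and a real search would read records from a sequence database file rather than an in-memory dictionary.

```python
import re

# C-terminal consensus: Asp, any of the 20 common amino acids, then Val
# as the final residue (the free carboxyl terminus).
DXV_MOTIF = re.compile(r"D[ACDEFGHIKLMNPQRSTVWY]V$")


def ends_with_dxv(sequence: str) -> bool:
    """True if the protein sequence terminates in D-X-V."""
    # Normalize case and drop a trailing '*' stop marker if present.
    cleaned = sequence.strip().upper().rstrip("*")
    return DXV_MOTIF.search(cleaned) is not None


def find_candidates(records: dict) -> list:
    """Return names of records whose C-terminus matches the consensus."""
    return [name for name, seq in records.items() if ends_with_dxv(seq)]


# Hypothetical placeholder records, not real database entries.
EXAMPLE_RECORDS = {
    "hypothetical_protein_A": "MKTLLSAGEDSV",  # ends D-S-V: match
    "hypothetical_protein_B": "MKTLLSAGETSV",  # ends T-S-V: no match
    "hypothetical_protein_C": "MDAVKLLQ",      # internal D-A-V only: no match
}

if __name__ == "__main__":
    print(find_candidates(EXAMPLE_RECORDS))
```

Anchoring the pattern at the end of the sequence matters because the consensus is defined relative to the free carboxyl terminus; an internal D-X-V tripeptide is not counted.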

If desired, the method can further include the step of assembling a protein 
interaction network (PIN) sufficient to correlate (particularly match) a plurality of 
random peptide sequences to the orphan protein domain. In this method, the plurality 
of random peptide sequences are capable of binding the correlated orphan protein 
domain with a binding affinity in the micromolar to nanomolar range as noted below. 

The method can further include assembling a super protein interaction network 
(SPIN) comprising a plurality of protein interaction networks (PINs) sufficient to 
serve as an electronic extension database for the protein sequence database. 

Typically, the assembly is assisted by one or more suitable computer programs 
such as those generally known in the field for compiling protein and/or nucleic acid 
sequences in a matrix or matrix-type format. The matrix or matrix-type format can be 
readily searched with a test sequence that can be, e.g., a peptide ligand sequence or 
orphan domain sequence in accord with the invention. 
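
One way to picture a searchable collection of PINs is the minimal sketch below. It is an assumption-laden illustration, not the disclosed implementation: the class and method names are hypothetical, the sequences in the usage note are placeholders, and exact substring matching stands in for whatever sequence-comparison program would actually be used.

```python
from collections import defaultdict


class SuperProteinInteractionNetwork:
    """Toy container mapping orphan domains to their peptide ligands."""

    def __init__(self):
        # One PIN per domain: domain name -> set of ligand sequences.
        self._pins = defaultdict(set)
        # Domain name -> domain amino acid sequence.
        self._domain_sequences = {}

    def add_pin(self, domain_name, domain_sequence, peptides):
        """Register (or extend) the PIN for one orphan protein domain."""
        self._domain_sequences[domain_name] = domain_sequence.upper()
        self._pins[domain_name].update(p.upper() for p in peptides)

    def ligands_for_domain_sequence(self, query):
        """Search with a domain sequence; return ligands of matching PINs.

        Matching here is plain substring containment in either direction,
        a stand-in for a real similarity search.
        """
        q = query.upper()
        hits = {}
        for name, seq in self._domain_sequences.items():
            if seq in q or q in seq:
                hits[name] = sorted(self._pins[name])
        return hits

    def domains_for_peptide(self, peptide):
        """Search with a peptide ligand; return domains known to bind it."""
        p = peptide.upper()
        return sorted(name for name, peps in self._pins.items() if p in peps)
```

For example, after `spin.add_pin("domain_A", "GLGFSV", ["EDSV", "GDAV"])` on a fresh instance, querying `spin.ligands_for_domain_sequence("MKGLGFSVAA")` returns the two ligands recorded for `domain_A`, and `spin.domains_for_peptide("edsv")` returns `["domain_A"]`.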

The invention further provides a method of detecting a peptide ligand capable 
of specifically binding an orphan protein domain of interest, the method comprising 
searching the super protein interaction network (SPIN) with an amino acid sequence 
comprising an orphan protein domain of interest, and identifying the peptide ligand 
capable of specifically binding the orphan protein domain of interest. The peptide 
ligand can be obtained from any suitable source such as any of the random peptide 
libraries discussed previously. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic diagram showing affinity selection from a C-terminal 
peptide library. 
Figure 2A is a graph showing affinity selection of peptides interacting with 
PDZ3 of PSD-95 by ELISA. 

Figure 2B is an alignment of deduced amino acid sequences of PDZ3 specific 
clones. Eleven clones were randomly chosen and sequenced. Single-letter codes for 
the 20 amino acids are used. Italic letters indicate amino acids present at the end of the 
linker which separates LacI and the fused peptide. An asterisk (*) indicates a stop codon. 

Figure 3A is a graph showing in vitro selection of peptides interacting with 
nNOS-PDZ. The graph shows identification of nNOS-PDZ interacting clones by 
ELISA. After 4 rounds of affinity panning, a total of 150 individual clones were 
randomly selected and tested for interaction with nNOS-PDZ by ELISA as described 
in experimental procedures. Clones 1 to 48 are shown (horizontal axis). Gray bars: 
BSA; open bars: GST-NAB HE rg + BSA; closed bars: GST-nNOS-PDZ+ BSA. 

Figures 3B and 3C illustrate a sequence alignment of 95 independent nNOS 
binding peptides (NBPs). The deduced amino acid sequences of the clones were 
obtained and aligned according to the first stop codon (*). The italic Gs are part of 
the linker region. The library template (GGG-X15-*) is shown at the top of the sequence 
alignment. 

Figures 4A-4I are graphs showing determinations of a consensus nNOS 
binding peptide (NBP). Normalized amino acid abundance of the final nine residues 
from the population of 95 independent nNOS binding peptides (closed bars) is 
compared in each figure with codon frequency in the original library (open bars). 
Residues in the library linker region were not included in each figure. 
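
The comparison of observed residue abundance against the library's expected frequency can be sketched as a per-position enrichment ratio. This is a hedged illustration only: the ratio form, the function name, and the uniform library frequencies used in the usage note are assumptions, since the specification does not give the normalization formula or the codon scheme here.

```python
from collections import Counter


def positional_enrichment(peptides, library_freq, n_positions=9):
    """Per-position enrichment of each residue over library expectation.

    peptides     : selected peptide sequences, aligned at the C-terminus
    library_freq : dict mapping residue -> expected frequency in the
                   unselected library (supplied by the caller)
    n_positions  : number of C-terminal residues to analyze

    Returns a list of dicts, one per position from the most N-terminal
    analyzed position to the final residue, mapping residue -> ratio of
    observed frequency to expected frequency.
    """
    tails = [p[-n_positions:] for p in peptides if len(p) >= n_positions]
    result = []
    for i in range(n_positions):
        counts = Counter(tail[i] for tail in tails)
        total = sum(counts.values())
        result.append(
            {aa: (c / total) / library_freq[aa] for aa, c in counts.items()}
        )
    return result
```

With four placeholder peptides `["AADAV", "GGDSV", "TTDKV", "AATAV"]`, three analyzed positions, and an assumed uniform expectation of 1/20 per residue, Asp at the -2 position scores about 15 and Val at the final position scores about 20, which is the kind of signal a consensus such as D-X-V would produce.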

Figure 5A is a graph showing all 95 NBPs fail to interact with PDZ3. ELISA 
results of 36 randomly chosen NBP clones are shown. Horizontal axis: NBP clone 
number; vertical axis: ELISA signal normalized against clones with strongest binding. 

Figure 5B is a graph illustrating that mutating Y77D78 to H77E78 changes the 
nNOS PDZ binding specificity from D-X-V to T-X-V. ELISA results of two high 
affinity peptides are shown. NBP-161 for nNOS (EC50 = ~8 nM) and PD-325 for 
PDZ3 (EC50 = ~2 nM) are expressed as maltose binding protein fusions and affinity 
purified on amylose agarose beads (see Experimental Procedures). 

Figure 5C is a graph showing that the aspartate at the -2 position is critical for 
NBP binding. Single amino acid substitutions at the -2 position were obtained. The 
peptides were expressed as maltose binding protein fusions at the C-terminus (see 
Experimental procedures). ELISA results of seven mutants are shown. 

Figure 5D is a representation of a Western immunoblot. Solubilized brain 
extracts were incubated with amylose resin alone (lane 1), amylose resin saturated 
with a maltose binding protein fusion containing a C-terminal NBP-123 (lane 2) or 
with the same fusion protein in which the -2 aspartate was changed to threonine (lane 
3). The beads were washed and retention of nNOS was detected by western blotting. 
Molecular weight standards in kDa are marked on the left. 

Figure 6 is a schematic diagram showing that functional nNOS PDZ has a 
uniquely large structure. The location of the PDZ domain is shown in the N-terminus 
of nNOS. Interaction of nNOS with the PDZ domains of PSD-93 requires amino 
acids 16-130 of nNOS. Association of nNOS fusions with PSD-93 was evaluated by 
the yeast two hybrid system and is expressed as β-galactosidase units. Interactions of 
five different NBPs (#64-68) with nNOS fusions were evaluated by ELISA and are 
expressed as normalized OD405. 

Figure 7 is a list of known orphan protein domains (common protein 
modules). 

Figures 8A-8R show results of a search (scan) of a non-redundant protein 
sequence database (Genbank) identifying protein sequences comprising the -D-X-V-COOH 
sequence where D is Asp, X is any of the 20 common amino acids, and V is 
Val. Identified protein sequences are listed in bold script and are grouped according 
to species (human, mouse, rat, etc.). Various descriptors accompany each identified 
protein sequence in accord with nomenclature adopted by Genbank. 
DETAILED DESCRIPTION OF THE INVENTION 

Unless defined otherwise, all technical and scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to 
which this invention belongs. Although any methods and materials similar or 
equivalent to those described herein can be used in the practice or testing of the 
present invention, the preferred methods and materials are described. For purposes of 
the present invention, the following terms are defined below. 

In the polypeptide notation used herein, the left-hand direction is the amino 
terminal direction and the right-hand direction is the carboxy-terminal direction, in 
accordance with standard usage and convention. Similarly, unless specified 
otherwise, the left-hand end of single-stranded polynucleotide sequences is the 5' end; 
the left-hand direction of double-stranded polynucleotide sequences is referred to as 
the 5' direction. The direction of 5' to 3' addition of nascent RNA transcripts is 
referred to as the transcription direction; sequence regions on the DNA strand having 
the same sequence as the RNA and which are 5' to the 5' end of the RNA transcript 
are referred to as "upstream sequences"; sequence regions on the DNA strand having 
the same sequence as the RNA and which are 3' to the 3' end of the RNA transcript 
are referred to as "downstream sequences". 

The term "protein interaction inhibitor" is used herein to refer to an agent 
which is identified by one or more screening method(s) of the invention as an agent 
which selectively inhibits protein-protein binding between a first interacting 
polypeptide and a second interacting polypeptide. Some protein interaction inhibitors 
may have therapeutic potential as drugs for human use and/or may serve as 
commercial reagents for laboratory research or bioprocess control. Protein interaction 
inhibitors which are candidate drugs are then tested further for activity in assays 
which are routinely used to predict suitability for use as human and veterinary drugs, 
including in vivo administration to non-human animals and often including 
administration to humans in approved clinical trials. 

As used herein, the term "operably linked" refers to a linkage of
polynucleotide elements in a functional relationship. A nucleic acid is "operably
linked" when it is placed into a functional relationship with another nucleic acid 
sequence. For instance, a promoter or enhancer is operably linked to a coding 
sequence if it affects the transcription of the coding sequence. 

Operably linked means that the DNA sequences being linked are typically
contiguous and, where necessary to join two protein coding regions, contiguous and in
reading frame. However, since enhancers generally function when separated from the 
promoter by several kilobases and intronic sequences may be of variable lengths, 
some polynucleotide elements may be operably linked but not contiguous. 

As used herein, the term "orphan protein domain" refers to any domain of a
protein which binds or interacts with another protein, particularly but not limited to
PDZ domains. Orphan protein domains are typically contiguous stretches of amino
acids that facilitate protein-protein interactions. Orphan protein domains, however,
do include domains comprising non-contiguous stretches of amino acids that through
secondary and tertiary structure are brought into association to facilitate protein-
protein interactions. Protein-protein interactions typically comprise, but are not
limited to, non-covalent bonds that account for the specificity of interaction between
two proteins. Examples of such non-covalent bonds include van der Waals contacts,
hydrogen bonds and salt bridges. Examples of known orphan protein domains are set
forth in Figure 7.

Preferred orphan protein domains have a length of between about 1 to 1000
amino acids, preferably about 1 to 500 amino acids, and more preferably about 1 to
100 amino acids. Particularly preferred orphan protein domains include more than
one amino acid and are capable of specifically binding a peptide ligand with a binding
affinity (EC50) of between about 0.001 to 100 µM, preferably 0.2 to 1 µM and more
preferably 8 to 100 nM as defined by any suitable immunological assay such as
Western blotting, ELISA, RIA, gel mobility shift assay, enzyme immunoassay,
competitive assays, saturation assays or other suitable protein binding assays known
in the field and specified below. See generally Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York (1989), Sambrook et al. infra, and
Harlow and Lane, Antibodies: A Laboratory Manual, CSH Publications, N.Y. (1988),
for disclosure relating to suitable methods for detecting specific binding between
proteins.

A "DNA binding protein" as the term is used herein, refers to a protein that
specifically binds a DNA strand and preferably two DNA strands of the recombinant
DNA vector. More preferably, the DNA binding protein specifically binds to the
specific DNA sequence included in the vector. In embodiments of the invention in
which RNA vectors are used, DNA binding protein can also refer to an RNA binding
protein.

Suitable DNA binding proteins are known in the field. For example, suitable
prokaryotic DNA binding proteins include lac repressor, phage 434 repressor, lambda
phage cI and cro repressors, phage P22 Arc and Mnt repressors, and CAP protein.
Also included are eukaryotic DNA binding proteins such as those comprising
homoeoboxes with helix-turn-helix motifs, proteins including helix-loop-helix
structures, particularly myc; fos, jun and other proteins including leucine zippers and
DNA binding domains, POU domain proteins, TFIIIA, and yeast Gal4 protein.

Preferably, the DNA binding protein is the lac repressor, particularly the 37
kDa protein encoded by the E. coli lacI gene capable of repressing transcription from the
lacZYA operon by binding to a specific DNA sequence termed lacO. See e.g.,
Ausubel et al., supra; Sambrook et al., supra; Knight et al., J. Biol. Chem. 264:3639-
3642 (1989); Beyreuther in The Operon (Miller and Reznikoff, eds. Cold Spring
Harbor Laboratory (1980)).

A "host cell" as the term is used herein is a eukaryotic or prokaryotic cell or 
cell group that is capable of being transformed by a recombinant DNA vector. 
Preferably, the host cell is a suitable bacterial strain such as E. coli K12. 

A "peptide ligand" refers to a molecule and particularly a peptide such as a
random peptide that is capable of being specifically bound by an immobilized orphan
protein domain. In addition, the peptide ligand is capable of being bound by the
orphan protein domain as it exists in a protein. Preferably, the binding affinity (EC50)
between the peptide ligand and the immobilized orphan protein domain is between
about 0.001 to 100 µM, preferably 0.2 to 1 µM and more preferably 8 to 100 nM as
determined by a suitable binding assay as described herein.
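
By way of illustration only, the three nested EC50 ranges recited above can be expressed as a simple classifier. The following Python sketch is not part of the disclosed methods; the function name and the returned labels are hypothetical, and the EC50 is assumed to be supplied in nanomolar units (0.001 µM = 1 nM; 100 µM = 100,000 nM).

```python
def affinity_tier(ec50_nM: float) -> str:
    """Place an EC50 value (in nM) into the ranges recited in the text.

    The narrowest range is tested first, because the ranges are nested.
    """
    if 8 <= ec50_nM <= 100:
        return "more preferred (8 to 100 nM)"
    if 200 <= ec50_nM <= 1000:
        return "preferred (0.2 to 1 uM)"
    if 1 <= ec50_nM <= 100_000:
        return "within range (0.001 to 100 uM)"
    return "outside stated range"


# Illustrative values only; these are not measured affinities.
for value in (50, 500, 5000, 0.5):
    print(value, "nM:", affinity_tier(value))
```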

By the term "specific binding" or similar term is meant a molecule disclosed
herein which binds another molecule, thereby forming a specific binding pair, but
which does not recognize and bind to other molecules as determined by, e.g., Western
blotting, ELISA, RIA, gel mobility shift assay, enzyme immunoassay, competitive
assays, saturation assays or other suitable protein binding assays known in the field.

By the term "immobilized orphan protein domain" is meant an amino acid
sequence corresponding to a desired orphan protein domain that has been covalently
or non-covalently bound to a solid support or surface such as a particle or a dish. If
desired the immobilized orphan protein can be immobilized by attaching an
immunologically recognizable ligand, e.g., biotin, bound to streptavidin which is
attached to the solid support or surface. If desired, the ligand may be attached by a
peptide linker sequence.

Exemplary peptide linker sequences in accord with the invention comprise up
to 20 amino acids, preferably up to about 10 amino acids, and more preferably from
about 1 to 5 amino acids. The linker sequence is generally flexible so as not to hold the
random peptide in a single rigid conformation. The linker sequence can be used, e.g.,
to space the DNA binding protein from the fused random peptide sequence.

Preferably, the orphan protein domain will be up to about 1000, preferably up to
about 500 and more preferably up to about 100 amino acids in length. It is also
preferred that the orphan protein domain be immobilized on a solid support or surface
which is conducive to standard affinity panning (i.e. biopanning or panning) techniques
capable of detecting nanomolar binding affinities between proteins. A preferred solid
support is a microtitre dish.

The term "random peptide" refers to an amino acid oligomer comprising two
or more amino acid residues that have been constructed by a recognized stochastic or
random process. A "random peptide library" refers not only to a set of recombinant
DNA vectors that encodes a set of random peptides, but also to the set of random
peptides encoded by those vectors, as well as the fusion proteins containing those
random peptides.

The Protein Interaction Network (PIN) is generally applicable to identifying 
the amino acid sequences which interact with a given orphan protein domain. 

A PIN in accord with the invention can be assembled and then stored in a
variety of ways. For example, a desired PIN can be assembled and stored by use of a
computer program such as Netscape and particularly a Netscape assisted program.
The program can be run (i.e. performed) on any suitable computer such as a PC
(IBM) or Macintosh (Apple) computer. A preferred PIN includes between about 100
to 10^13, preferably about 1000 to 10^12, and more preferably about 10^12 peptide ligand
sequences.

Once assembled, the PIN of interest can be further assembled into a Super
Protein Interaction Network (SPIN) by use of a computer program such as BLAST
run on, e.g., a conventional central server system. The size of the SPIN will depend
on several parameters such as the complexity of the PIN assembly and desired
electronic connections with other database networks. In general, the SPIN will
include between about 5 to 10^8, preferably 500 to 10^8, and more preferably 500 to 10^7
PINs. Compilation and analysis of multiple PINs is facilitated by any number of
stand alone computer-assisted programs, particularly BLAST and other secondary
sequence computer programs known in the field.

The present invention is based on the discovery that a random fusion protein 
library wherein random peptides are fused to the C-terminus of a bacterial DNA 
binding protein such as a transcriptional repressor can be used to select for specific
peptide ligands that bind to a given orphan domain. The gene encoding the fusion
protein is operably linked on a plasmid to the fusion protein's binding site. Following
expression or induction of the expression of the fusion protein in a transformed or
transfected host cell, the fusion protein binds to its cognate binding sequence on the
plasmid. This linkage of the fusion protein to the plasmid which itself encodes the
fusion protein allows for repeated rounds of selection for specific peptide ligands in 
the library by affinity purification of fusion protein-plasmid complexes using an 
orphan domain of interest. The plasmid can then be dissociated from the complex and 
used to retransform appropriate host cells for another round of selection. 

Generally, the nomenclature used hereafter and the laboratory procedures in
cell culture, molecular genetics, and nucleic acid chemistry described
below are those well known and commonly employed in the art. Standard techniques
are used for recombinant nucleic acid methods, polynucleotide synthesis, and
microbial culture and transformation (e.g., electroporation, lipofection). Generally
enzymatic reactions and purification steps are performed according to the
manufacturer's specifications. The techniques and procedures are generally
performed according to conventional methods in the art and various general
references (see, generally, Sambrook et al., Molecular Cloning: A Laboratory Manual,
2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., which
is incorporated herein by reference) which are provided throughout this document.
The procedures therein are believed to be well known in the art and are provided for 
the convenience of the reader. All the information contained therein is incorporated 
herein by reference. 

General methods for assembling amino acid and nucleic acid sequence data in
accord with the methods described herein have been disclosed. See S. Altschul et al.,
J. Mol. Biol., 215:403-410 (1990); and S. Altschul et al., Nuc. Acids Res., 25:3389-
3402 (1997) for disclosure relating to the BLAST, particularly gapped BLAST, and
PSI-BLAST computer programs, the disclosures of which are fully incorporated herein
by reference.

Peptides of the invention comprising those that bind to nNOS are at least 3
amino acids long and comprise the consensus sequence Asp-X-Val. Peptides of
longer length are also encompassed within the invention with the proviso that the
peptide contain the consensus sequence, preferably at the C-terminal end.
Accordingly, peptides of at least 5 amino acids, at least 7 amino acids, at least 10
amino acids and at least 15 or more amino acids are encompassed. 
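
As a non-limiting illustration, a candidate sequence written in one-letter code can be checked for the C-terminal Asp-X-Val consensus as sketched below. The function name and the example sequences are hypothetical and are not peptides disclosed herein; X is assumed to be any of the twenty standard amino acids.

```python
import re

# Asp-X-Val in one-letter code: D, any standard residue, V, anchored at the
# C-terminal (right-hand) end of the sequence.
CONSENSUS = re.compile(r"D[ACDEFGHIKLMNPQRSTVWY]V$")


def ends_with_asp_x_val(peptide: str) -> bool:
    """Return True if the peptide is at least 3 residues long and ends in D-X-V."""
    peptide = peptide.strip().upper()
    return len(peptide) >= 3 and CONSENSUS.search(peptide) is not None


# Hypothetical sequences used only to exercise the check.
for seq in ("DSV", "GGGEDAV", "DSVG", "TSV"):
    print(seq, ends_with_asp_x_val(seq))
```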

The peptides of the invention may be prepared by recombinant nucleotide
expression techniques or by chemical synthesis using standard peptide synthesis
techniques. For example, peptides of the invention can be produced by
expressing cloned nucleotide sequences. Alternatively, peptides of the invention can
be generated directly from intact protein products. Peptides can be specifically
cleaved by proteolytic enzymes, including, but not limited to, trypsin, chymotrypsin
or pepsin. Each of these enzymes is specific for the type of peptide bond it attacks.
Trypsin catalyzes the hydrolysis of peptide bonds whose carbonyl group is from a
basic amino acid, usually arginine or lysine. Pepsin and chymotrypsin catalyze the
hydrolysis of peptide bonds from aromatic amino acids, particularly tryptophan,
tyrosine and phenylalanine. Alternate sets of cleaved peptide fragments are generated
by preventing cleavage at a site which is susceptible to a proteolytic enzyme. For
example, reaction of the epsilon-amino groups of lysine with ethyltrifluorothioacetate
in mildly basic solution yields a blocked amino acid residue whose adjacent peptide
bond is no longer susceptible to hydrolysis by trypsin (Goldberger et al., Biochem.,
1:401 (1962)).

Peptides of the invention also can be modified to create peptide linkages that
are susceptible to proteolytic enzyme catalyzed hydrolysis. For example, alkylation
of cysteine residues with beta-haloethylamines yields peptide linkages that are
hydrolyzed by trypsin (Lindley, Nature, 178:647 (1956)). In addition, chemical
reagents that cleave peptide chains at specific residues can be used (Witkop, Adv.
Protein Chem., 16:221 (1961)). For example, cyanogen bromide cleaves peptides at
methionine residues (Gross et al., J. Am. Chem. Soc., 83:1510 (1961)). Thus, by
treating full-length proteins with various combinations of modifiers, proteolytic
enzymes and/or chemical reagents, numerous discrete overlapping peptides of varying
sizes are generated. These peptide fragments can be isolated and purified from such
digests by chromatographic methods.
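
The generation of overlapping fragment sets by different cleavage agents can be pictured with a short in silico digest. The sketch below is a simplification offered for illustration only: it assumes cleavage on the carboxyl side of every listed residue (lysine or arginine for trypsin, methionine for cyanogen bromide) and ignores missed cleavages and sequence-context effects. The function name and the protein sequence are hypothetical.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Split a one-letter sequence on the C-terminal side of each listed residue."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # remaining C-terminal fragment
    return fragments


protein = "GAKMLRDEMVKDSV"  # hypothetical sequence, for illustration only

tryptic = cleave_after(protein, "KR")  # trypsin-like: after Lys or Arg
cnbr = cleave_after(protein, "M")      # cyanogen bromide-like: after Met

print(tryptic)  # ['GAK', 'MLR', 'DEMVK', 'DSV']
print(cnbr)     # ['GAKM', 'LRDEM', 'VKDSV']
```

Because the two agents cut at different positions, the two fragment sets overlap one another along the parent sequence.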
Most preferably, isolated peptides of the present invention can be synthesized
using an appropriate solid state synthetic procedure (Steward and Young, Solid Phase
Peptide Synthesis, Freemantle, San Francisco, Calif. (1968)). A preferred method is
the Merrifield process (Merrifield, Recent Progress in Hormone Res., 23:451 (1967)).
The binding activity of these peptides may conveniently be tested using, for example,
the assays as described herein. 

Once an isolated peptide of the invention is obtained, it may be purified by
standard methods including chromatography (e.g., ion exchange, affinity, and sizing
column chromatography), centrifugation, differential solubility, or by any other
standard technique for protein purification. For immunoaffinity chromatography, a
peptide may be isolated by binding it to an affinity column comprising antibodies that
were raised against that peptide, or a related peptide of the invention, and were affixed
to a stationary support. Alternatively, affinity tags such as hexa-His (Invitrogen),
Maltose binding domain (New England Biolabs, Inc.), influenza coat sequence
(Kolodziej et al., Methods Enzymol., 194:508-509 (1991)), and glutathione-S-
transferase can be attached to the peptides of the invention to allow easy purification
by passage over an appropriate affinity column. A DNA affinity column using DNA
containing a sequence encoding the peptides of the invention could be used in
purification.

Isolated peptides can also be physically characterized using such techniques as 
proteolysis, nuclear magnetic resonance, and x-ray crystallography. 

With regard to nucleic acid sequences of the present invention, "isolated" 
means: an RNA or DNA polymer, portion of genomic nucleic acid, cDNA, or 
synthetic nucleic acid which, by virtue of its origin or manipulation:

(i) is not associated with all of a nucleic acid with which it is associated in 
nature (e.g. is present in a host cell as a portion of an expression vector); or 

(ii) is linked to a nucleic acid or other chemical moiety other than that to 
which it is linked in nature; or 

(iii) does not occur in nature.

By "isolated" it is further meant a nucleic acid sequence: 

(i) amplified in vitro by, for example, polymerase chain reaction (PCR); 

(ii) synthesized by, for example, chemical synthesis; 

(iii) recombinantly produced by cloning; or 

(iv) purified, as by cleavage and gel separation.

The nucleic acid sequences of the present invention may be characterized, 
isolated, synthesized and purified using no more than ordinary skill. See Sambrook et
al., Molecular Cloning, Cold Spring Harbor Press, New York, 1989, incorporated
herein by reference.

Due to the degeneracy of nucleotide coding sequences (see Alberts et al.,
Molecular Biology of the Cell, Garland Publishing, New York and London, 1989,
page 103, incorporated herein by reference), a number of different nucleic acid
sequences may be used in the practice of the present invention. These include, but are
not limited to, sequences encoding the peptides of Figure 3B and 3C. This includes
the substitution of different codons encoding the same amino acid residue within the
sequence, thus producing a silent change. Almost every amino acid except tryptophan
and methionine is represented by several codons. Often the base in the third position
of a codon is not significant, because those amino acids having 4 different codons
differ only in the third base. This feature, together with a tendency for similar amino
acids to be represented by related codons, increases the probability that a single,
random base change will result in no amino acid substitution or in one involving an
amino acid of similar character.
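
The degeneracy described above can be checked directly against the standard genetic code. The following Python sketch is illustrative only; the variable names are hypothetical, and the codon table is the standard code written in the conventional T, C, A, G ordering with "*" marking stop codons.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code; codons enumerated TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the 61 sense codons by the amino acid they encode.
codons_by_aa = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        codons_by_aa[aa].append(codon)

# Amino acids encoded by a single codon (methionine and tryptophan).
single_codon = sorted(aa for aa, cs in codons_by_aa.items() if len(cs) == 1)

# Amino acids with exactly four codons, and whether those four codons share
# their first two bases (i.e. differ only at the third position).
four_codon = sorted(aa for aa, cs in codons_by_aa.items() if len(cs) == 4)
third_base_only = all(
    len({codon[:2] for codon in codons_by_aa[aa]}) == 1 for aa in four_codon
)

print(single_codon)     # ['M', 'W']
print(four_codon)       # ['A', 'G', 'P', 'T', 'V']
print(third_base_only)  # True
```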

The nucleotide sequences of the invention can be altered by mutations such as
substitutions, additions or deletions that provide for functionally equivalent nucleic
acid sequence. In particular, a given nucleotide sequence can be mutated in vitro or in
vivo, to create variations in coding regions and/or to form new restriction
endonuclease sites or destroy preexisting ones and thereby to facilitate further in vitro
modification. Any technique for mutagenesis known in the art can be used including,
but not limited to, in vitro site-directed mutagenesis (Hutchinson et al., J. Biol. Chem.,
253:6551 (1978)), use of TAB® linkers (Pharmacia), PCR-directed
mutagenesis, and the like. The functional equivalence of such mutagenized
sequences, as compared with unmutagenized sequences, can be empirically
determined by comparisons of structural and/or functional characteristics.

The isolated nucleotide sequences of the invention may be cloned or
subcloned using any method known in the art (See, for example, Sambrook, J. et al.,
Molecular Cloning, Cold Spring Harbor Press, New York, 1989), the entire contents
of which are incorporated herein by reference. In particular, nucleotide sequences of
the invention may be cloned into any of a large variety of vectors. Possible vectors
include, but are not limited to, cosmids, plasmids or modified viruses, although the
vector system must be compatible with the host cell used. Viral vectors include, but
are not limited to, lambda, simian virus, bovine papillomavirus, Epstein-Barr virus,
and vaccinia virus. Viral vectors also include retroviral vectors, such as
Amphotropic Murine Retrovirus (see Miller et al., Biotechniques, 7:980-990 (1984),
incorporated herein by reference). Plasmids include, but are not limited to, pBR,
pUC, pGEM (Promega), and Bluescript® (Stratagene) plasmid
derivatives. Introduction into and expression in host cells is done, for example, by
transformation, transfection, infection, electroporation, etc.

Examples of DNA vectors for constructing random peptide libraries, methods 
of making same, and useful related materials and methods have been disclosed in U.S. 
Pat. Nos. 5,270,170 and 5,498,530, the disclosures of which are incorporated herein
by reference. 

The peptides described herein can be used in pharmaceutical compositions to
alter the binding of the nNOS PDZ domain and the proteins with which this domain
interacts. The peptides preferably alter the interactions between the nNOS PDZ
domain and melatonin or non-NMDA type glutamate receptors. An exemplary
pharmaceutical composition is a therapeutically effective amount of one of the
disclosed peptides optionally included in a pharmaceutically-acceptable and
compatible carrier. The term "pharmaceutically-acceptable and compatible carrier" as
used herein, and described more fully below, refers to one or more compatible solid or
liquid filler diluents or encapsulating substances that are suitable for administration to
a human or other animal. In the present invention, the term "carrier" thus denotes an 
organic or inorganic ingredient, natural or synthetic, with which the peptides of the 
invention are combined to facilitate administration. 

Peptides of the invention can be stabilized to decrease protease sensitivity
and/or increase in vivo half-life by methods known in the art. For instance, peptides
of the invention can be modified by the addition of a N or C terminal tail, modified by
the methylation or glyoxylation of the termini or by substitution or other modification
to the sequence to increase the peptide half-life, stability, and/or protease resistance.

In some embodiments, the peptides are conformationally restricted such as
those which are cyclized, circularized or otherwise restricted by peptide and/or non-
peptide bonds to limit conformational variation and/or to increase stability and/or
half-life of the peptides. In some embodiments, peptides are provided as linear
peptides.

In some embodiments, peptides of the present invention comprise one or more
D amino acids. As used herein, the term "D amino acid peptides" is meant to refer to
peptides according to the present invention which comprise at least one and preferably
a plurality of D amino acids. D amino acid peptides consist of 4-25 amino acids. D
amino acid peptides retain the biological activity of the peptides of the invention that
consist of L amino acids, i.e. D amino acid peptides inhibit the interaction of nNOS
and the proteins which bind to nNOS. In some embodiments, the use of D amino acid
peptides is desirable as they are less vulnerable to degradation and therefore have a
longer half life. D amino acid peptides comprising mostly D amino acids or D
amino acid peptides that consist of only D amino acids may comprise amino acid
sequences in the reverse order of amino acid sequences of peptides.
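
For illustration, the reverse-order arrangement mentioned above can be written down mechanically from an L amino acid sequence. In the sketch below the function name and the example sequence are hypothetical, and the use of lower-case one-letter codes to denote D residues is a notational convention assumed here rather than one used in this specification.

```python
def reversed_d_sequence(l_peptide: str) -> str:
    """Return the reverse-order sequence, with lower case marking D residues."""
    return l_peptide.strip().upper()[::-1].lower()


# Hypothetical L peptide ending in Asp-X-Val; the output lists the same residues
# in reverse order, each understood to be the D enantiomer.
print(reversed_d_sequence("GEDAV"))  # vadeg
```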

The term "therapeutically-effective amount" is that amount of the present 
pharmaceutical compositions which produces a desired result or exerts a desired 
influence on the particular condition being treated. Various concentrations may be
used in preparing compositions incorporating the same ingredient to provide for
variations in the age of the patient to be treated, the severity of the condition, the 
duration of the treatment and the mode of administration. 

The term "compatible" as used herein, means that the components of the 
pharmaceutical compositions are capable of being commingled with the peptides of
the present invention, and with each other, in a manner such that there is no
interaction that would substantially impair the desired pharmaceutical efficacy. 

Dose of the pharmaceutical compositions of the invention will vary depending 
on the subject and upon particular route of administration used. By way of an 
example only, an overall dose range of from about 1 microgram to about 300
micrograms or 0.1 to 100 mg/kg/day is contemplated for human use. Pharmaceutical
compositions of the present invention can also be administered to a subject according 
to a variety of other, well-characterized protocols. Desired time intervals for delivery 
of multiple doses of a particular composition can be determined by one of ordinary 
skill in the art employing no more than routine experimentation. 

The peptides of the invention may also be administered per se (neat) or in the
form of a pharmaceutically acceptable salt. When used in medicine, the salts should
be pharmaceutically acceptable but non-pharmaceutically acceptable salts may
conveniently be used to prepare pharmaceutically acceptable salts thereof and are not
excluded from the scope of this invention. Such pharmaceutically acceptable salts
include, but are not limited to, those prepared from the following acids: hydrochloric,
hydrobromic, sulphuric, nitric, phosphoric, maleic, acetic, salicylic, p-toluene-
sulfonic, tartaric, citric, methanesulphonic, formic, malonic, succinic, naphthalene-2-
sulfonic, and benzenesulphonic. Also, pharmaceutically acceptable salts can be
prepared as alkaline metal or alkaline earth salts, such as sodium, potassium or
calcium salts of the carboxylic acid group.

The compositions include those suitable for oral, rectal, topical, nasal,
ophthalmic or parenteral administration, all of which may be used as routes of
administration using the materials of the present invention. Other suitable routes of
administration include intrathecal administration directly into spinal fluid (CSF),
direct injection onto an arterial surface and intraparenchymal injection directly into
targeted areas of an organ. Compositions suitable for parenteral administration are
preferred. The term "parenteral" includes subcutaneous injections, intravenous,
intramuscular, intrasternal injection or infusion techniques.

The compositions may conveniently be presented in unit dosage form and may 
be prepared by any of the methods well known in the art of pharmacy. All methods 
include the step of bringing the active ingredients of the invention into association 
with a carrier which constitutes one or more accessory ingredients. 

Compositions of the present invention suitable for oral administration may be
presented as discrete units such as capsules, cachets, tablets or lozenges, each 
containing a predetermined amount of the peptides of the invention or as a suspension 
in an aqueous liquor or non-aqueous liquid such as a syrup, an elixir, or an emulsion. 
Preferred compositions suitable for parenteral administration conveniently
comprise a sterile aqueous preparation of peptides of the invention which is preferably
isotonic with the blood of the recipient. This aqueous preparation may be formulated
according to known methods using those suitable dispersing or wetting agents and
suspending agents. The sterile injectable preparation may also be a sterile injectable
solution or suspension in a non-toxic parenterally-acceptable diluent or solvent, for
example as a solution in 1,3-butane diol. Among the acceptable vehicles and solvents
that may be employed are water, Ringer's solution and isotonic sodium chloride
solution. In addition, sterile, fixed oils are conventionally employed as a solvent or
suspending medium. For this purpose any bland fixed oil may be employed including
synthetic mono- or diglycerides. In addition, fatty acids such as oleic acid find use in
the preparation of injectables.

The following non-limiting examples are illustrative of the invention. 
General Comments 

The following laboratory procedures were used in the examples below.

1. Fusion Protein Expression and Purification 

GST-fusion proteins were expressed in either DH5α or BL21 bacterial strains.
Cultures with an OD600 of 0.2 were induced for three hours with isopropyl β-D-
thiogalactopyranoside (IPTG). Bacteria were harvested by centrifugation and
resuspended in 10 mL of NETN buffer which contains 20 mM
tris(hydroxymethyl)aminomethane (Tris), pH 8.0, 100 mM NaCl, 1 nM
ethylenediamine tetraacetic acid (EDTA), 0.5% NP-40, and 2 mM
phenylmethylsulfonyl fluoride (PMSF). The bacterial cells were lysed by sonication.
Affinity purification using glutathione-sepharose beads was carried out according to
protocols provided by the manufacturer (Pharmacia Biotech Inc., Uppsala, Sweden).

Fusion proteins can also be prepared using other fusion protein systems known 
in the art including those set forth in U.S. Patents 5,270,170 and 5,498,530, both of 
which are herein incorporated by reference. 

2. Library Construction 

The random 15-mer library was constructed as described in detail by P. Schatz
et al., Meth. Enzymol., 267:171-191 (1996), which is herein incorporated by reference,
using an oligonucleotide with a degenerate region of 15 codons in the form of NNK,
where N denotes an equimolar mix of all four bases and K denotes a mix of G or T.
The library consisted of 1.3 x 10^10 independent recombinants. The amplified library
was stored at -80°C in HEK buffer containing 35 mM HEPES pH 7.5, 0.1 mM
EDTA, and 50 mM KCl.
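
The properties of the NNK codon scheme can be enumerated as sketched below. This Python fragment is illustrative only and is not part of the cited library construction protocol; the variable names are hypothetical, and the comparison of library size with the number of possible 15-mers assumes twenty amino acids at each of the fifteen positions.

```python
from itertools import product

# Standard genetic code; codons enumerated TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}

print(len(nnk_codons))   # 32 codons
print(stop_codons)       # ['TAG'], the only stop codon ending in G or T
print(len(amino_acids))  # 20, every standard amino acid is still encoded

# Size of the library relative to all possible 15-residue peptides.
theoretical_15mers = 20 ** 15
library_size = 1.3e10  # independent recombinants, as stated in the text
fraction_sampled = library_size / theoretical_15mers
print(f"{fraction_sampled:.1e}")
```

Restricting the third position to G or T thus halves the number of codons and removes two of the three stop codons while retaining all twenty amino acids.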

Random peptide libraries may also be constructed using other DNA binding 
protein/specific binding site systems such as those disclosed in U.S. Patent Nos. 
5,498,530 and 5,270,170, each of which is herein incorporated by reference. 

3. Construction of maltose binding protein fusions

Nucleotide sequences encoding appropriate peptides were cloned into pELM3
(P. Schatz et al., Meth. Enzymol., 267:171-191 (1996)). This allows expression of the
corresponding maltose binding protein/peptide fusion. The procedure for expression
of maltose binding proteins was identical to that for GST fusions except that the LB
medium was supplemented with 2% glucose.

4. Affinity Panning

A 2 ml aliquot of thawed bacterial cells in HEK was added to 6 ml of lysis
buffer (25 mM HEPES pH 7.5, 0.07 mM EDTA, 8.3% glycerol, 1.25 mg/ml bovine
serum albumin (BSA), 0.83 mM DTT, 0.2 mM PMSF). The bacteria were lysed for 2
to 4 min on ice by the addition of 0.15 ml of 10 mg/ml lysozyme (Boehringer
Mannheim, Indianapolis, IN) and then 2 ml of 20% lactose and 0.25 ml of 2 M KCl
were added. The supernatant was obtained after a 15 min centrifugation at 27,000 x
g. To initiate panning, 12 wells of a 96-well plate were first coated with GST-fusion
proteins (10 µg protein per well) at 4°C for 1 hour. The wells were then blocked with
1% BSA in phosphate-saline buffer (PBS) at pH 7.4. After precoating, 250 µl of the
supernatant was added to each of the precoated wells. After gentle agitation for 1 hour at
4°C, the unbound material was recovered and the wells were then washed with a
series of solutions: 5 times with HEK buffer supplemented with 0.2 M lactose and 1%
BSA, twice with HEK supplemented with 0.2 M lactose, and twice with HEK at 4°C.
The bound plasmids were eluted with 35 mM HEPES, pH 7.5, 0.1 mM EDTA, 200
mM KCl, 1 mM IPTG for 30 min at room temperature. The eluted DNA was
precipitated with isopropanol and amplified by electrotransformation. This pool of
bacterial transformants was used in subsequent rounds of panning.

The panning procedure was monitored by two parameters: recovery and
enrichment. Recovery was calculated by subtracting the number of plasmids bound to
BSA-coated wells from the number of plasmids bound to receptor/BSA-coated wells. The
enrichment at each round of panning was the ratio of recovered plasmids from
receptor coated wells to those recovered from BSA coated wells. The details of one
affinity panning using PDZ3 of PSD-95 are shown:



Round No.    Input         Output         Recovery       Enrichment
1            6.0 x 10^9    1.72 x 10^5    2.9 x 10^-5
2            3.2 x 10^9    1.4 x 10^5     4.4 x 10^-5    3
3            1.2 x 10^8    1.1 x 10^6     5.9 x 10^-3    270
4            8.4 x 10^7    4.8 x 10^6     5.7 x 10^-2    1,700
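
The arithmetic behind the panning table can be cross-checked with a short sketch. The text defines recovery by subtraction, whereas the tabulated recovery values for rounds 1, 2 and 4 agree, to within rounding, with output divided by input; the sketch below uses that ratio purely as an assumption inferred from the printed numbers (round 3 as printed does not fit this relation and is left out). Enrichment is taken as the receptor-well to BSA-well ratio described above. All function and variable names are illustrative.

```python
import math


def recovery_fraction(input_plasmids: float, output_plasmids: float) -> float:
    """Fraction of input plasmids recovered in one round (assumed: output / input)."""
    return output_plasmids / input_plasmids


def enrichment(receptor_recovery: float, bsa_recovery: float) -> float:
    """Ratio of plasmids recovered from receptor-coated wells to BSA-coated wells."""
    return receptor_recovery / bsa_recovery


# (input, output, tabulated recovery) for rounds 1, 2 and 4 of the PDZ3 panning table
PANNING_ROUNDS = {
    1: (6.0e9, 1.72e5, 2.9e-5),
    2: (3.2e9, 1.4e5, 4.4e-5),
    4: (8.4e7, 4.8e6, 5.7e-2),
}

for round_no, (n_in, n_out, tabulated) in PANNING_ROUNDS.items():
    computed = recovery_fraction(n_in, n_out)
    agrees = math.isclose(computed, tabulated, rel_tol=0.02)
    print(f"round {round_no}: computed {computed:.2e}, tabulated {tabulated:.1e}, agrees={agrees}")
```

The 2% tolerance only absorbs the two-significant-figure rounding of the printed values.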

5. ELISA

After three to four rounds of affinity panning, individual colonies were
randomly selected. Overnight cultures from single colonies were diluted 1:10 in 3 ml
of LB ampicillin (100 µg/ml) and grown for 1 hour at 37°C. The expression of the
LacI-peptide fusions was induced by the addition of arabinose to 0.2% for 3 hours. After
induction, the cells were pelleted by centrifugation and lysed as described above in 1
ml of lysis buffer plus lysozyme. The clarified lysates were used immediately for
ELISA or stored at -70°C. To prepare the ELISA, 96-well plates were first coated with
GST-fusion proteins (0.2 µg protein per well) of the nNOS, PSD-95, or disheveled PDZ
domain at 4°C for 1 hour. The wells were then blocked with 1% BSA in
phosphate-saline buffer (PBS) at pH 7.4. After precoating, the wells were washed three
times with PBS supplemented with 0.05% Tween-20 (PBT). To initiate the binding, 100 µl
of 1:10 diluted lysate was added to each well. After 30 minutes at 4°C, the plate was
washed four times with PBT. The binding of LacI-peptide was detected using rabbit
anti-Lac I antibody. After 4 washes with PBT, the plate was developed by adding
alkaline phosphatase-conjugated goat anti-rabbit antibody (GIBCO-BRL,
Gaithersburg, MD) in PBS/0.1% BSA (100 µl per well for 1 hour at 25°C) followed
by a 6 min treatment with p-nitrophenyl phosphate (4 mg/ml) in 1 M diethanolamine
hydrochloride, pH 9.8/0.24 mM MgCl2 (200 µl per well). Binding was quantified by
monitoring optical density (O.D.) at 405 nm on an E-max plate reader (Molecular
Devices Inc., Menlo Park, CA). The negative controls were wells coated with control
GST fusion or as otherwise indicated. All experiments were repeated at least once
with similar results.

ELISAs for maltose binding fusion proteins were performed as described
above with a few modifications. 100 µl of a 1:50 dilution of crude lysate was added
to each well. All buffers were the same but were supplemented with 1 mM maltose to
minimize oligomerization of maltose binding protein fusions (G. Richarme,
Biochemical and Biophysical Research Communications 105:476-481 (1982)).
Interaction of maltose binding protein fusion proteins with immobilized GST-fusion
proteins was monitored by rabbit anti-maltose binding protein antibody (1:10,000
dilution, New England Biolabs, Inc., Beverly, MA).

6. Peptide-PDZ binding

To determine the affinity of peptide-PDZ interactions, monomeric maltose
binding protein fusions of peptides were purified by amylose affinity columns
according to a protocol provided by the manufacturer (New England Biolabs, Inc.,
Beverly, MA). Protein concentration was determined by the Bradford assay (BioRad,
Richmond, CA) using BSA as standard. The effective concentration, i.e., EC50, was
determined by dose-dependent ELISA tests. GST fusion was bound at 0.05 µg per
well. The maltose binding protein fusions were incubated after being serially diluted
(1:5) starting at 15 µM. The data were fit with the Hill equation
($\mathrm{O.D.}_{405} = \mathrm{O.D.}_{405\mathrm{Max}} / \left(1 + (\mathrm{EC}_{50}/[x])^{n}\right)$).
A non-linear least squares algorithm was used.
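
A minimal sketch of such a dose-response fit is given below. It is not the algorithm used in the original work, which is not specified beyond "non-linear least squares"; here a plain grid search over EC50 and the Hill coefficient n stands in for it, with the maximal O.D. solved in closed form at each grid point. The data are synthetic, generated from the Hill equation itself with assumed parameters (EC50 = 40 nM, n = 1, maximal O.D. = 1.5), and all names are illustrative.

```python
def hill_od(x, od_max, ec50, n):
    """O.D.405 predicted by the Hill equation at ligand concentration x (molar)."""
    return od_max / (1.0 + (ec50 / x) ** n)


def serial_dilution(start, factor, steps):
    """Concentrations of a serial dilution, e.g. 1:5 steps starting at 15 uM."""
    return [start / factor ** i for i in range(steps)]


def fit_hill(concs, ods):
    """Least-squares fit by grid search over EC50 and n; od_max is solved in closed form."""
    best = None
    for i in range(401):
        ec50 = 10 ** (-9 + 0.01 * i)  # 1 nM to 10 uM, log-spaced
        for k in range(26):
            n = 0.5 + 0.1 * k  # Hill coefficient from 0.5 to 3.0
            shape = [1.0 / (1.0 + (ec50 / x) ** n) for x in concs]
            # For fixed EC50 and n the model is linear in od_max
            od_max = sum(y * f for y, f in zip(ods, shape)) / sum(f * f for f in shape)
            sse = sum((y - od_max * f) ** 2 for y, f in zip(ods, shape))
            if best is None or sse < best[0]:
                best = (sse, od_max, ec50, n)
    _, od_max, ec50, n = best
    return od_max, ec50, n


# Eight-point 1:5 dilution series starting at 15 uM, as in the protocol
concs = serial_dilution(15e-6, 5, 8)
# Synthetic, noise-free readings from assumed parameters (not measured data)
synthetic = [hill_od(x, 1.5, 40e-9, 1.0) for x in concs]

od_max_fit, ec50_fit, n_fit = fit_hill(concs, synthetic)
print(f"EC50 ~ {ec50_fit * 1e9:.1f} nM, n ~ {n_fit:.1f}, O.D. max ~ {od_max_fit:.2f}")
```

The grid resolution limits the precision of the recovered EC50; a gradient-based optimizer would normally replace the grid search for real data.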

7. Yeast Two Hybrid Analysis 

Yeast Y187 cells were co-transformed with expression vectors encoding
various Gal4 DNA binding domain-nNOS fusions and the Gal4 activation domain
fused to PSD-93 (amino acids 116-421). Each transformation mixture was plated on
synthetic dextrose plates lacking tryptophan and leucine. Interaction was measured
by the liquid culture β-galactosidase assay as described (S. Fields et al., Nature,
340:245-246 (1989); and Song, 1989; Clontech, Palo Alto, CA). Values are
representative of duplicate experiments.

8. Fusion Protein Affinity Chromatography 

Rat whole brain was homogenized in 10 volumes (w/v) of Tris-HCl buffer, pH 7.4,
and centrifuged at 32,000 x g for 20 minutes. Membranes were solubilized for 2
hours at 4°C in buffer containing 200 mM NaCl and 1% Triton X-100, and insoluble
material was pelleted by centrifugation at 100,000 x g for 30 minutes. Extracts were
incubated with control amylose beads or amylose beads saturated with maltose-binding
fusion proteins as indicated. Samples were loaded into disposable columns,
which were washed with 50 volumes of buffer containing 1% Triton X-100 + 300
mM NaCl. Retained proteins were eluted with 150 µl of loading buffer and were
resolved by SDS-PAGE. Blots were probed with a monoclonal antibody to
nNOS (Transduction Labs, Lexington, KY).

Example 1 - Construction of a random C-terminal peptide library

Peptide binding and x-ray crystallographic studies of PSD-95 indicate that
specificity of the peptide-PDZ interaction is primarily determined by the final 4
residues of the peptide ligand (D. Doyle et al., Cell, 85:1067-1076 (1996); E. Kim et
al., Nature, 378:85-88 (1995); H. Kornau et al., Science, 269:1737-1740 (1995); B.
Muller et al., Neuron, 17:255-265 (1996); M. Niethammer et al., J. Neurosci.,
16:2157-2163 (1996)). To determine optimal peptide binding ligands for other PDZ
domains, we constructed a fusion protein library that contains 15 randomized residues
at the C-terminus. In this library, a degenerate oligonucleotide encoding the random
peptides is fused to the end of the E. coli lac repressor (M. Cull et al., Proc. Natl.
Acad. Sci. USA, 89:1865-1869 (1992)), which is herein incorporated by reference.
Following expression, the Lac repressor protein binds to the lac operator sequence on
the same plasmid, linking each randomized 15-mer peptide to the plasmid encoding that
peptide (Figure 1). This linkage allows repeated rounds of selection for specific peptide
ligands in the population by affinity purification of peptide-repressor-plasmid
complexes (see the experimental procedures set forth above).

In vitro selection of optimal binding peptides for PDZ domains 

A random 15-mer peptide library was screened using the third PDZ (PDZ3) domain of
PSD-95 according to the following steps. Step I. A pool of oligonucleotides
encoding 15 random amino acids (X15) was cloned in frame C-terminal to lac I.
Protein expression from each plasmid of the library yields a Lac I fusion with a
distinct peptide sequence. The recombinant Lac I binds the lac O sites present on the
same plasmid, yielding Lac I-plasmid complexes that are purified from the E. coli.
Step II. Affinity panning selects peptides that interact with the target receptor, e.g., a
PDZ domain. Step III. The bound plasmid DNA can be specifically recovered by addition
of IPTG. Step IV. The recovered plasmids are retransformed, amplified, and used for
subsequent rounds of panning.

In PSD-95, the PDZ1 and PDZ2 domains interact with the C-terminal four amino
acids found in Shaker potassium channels and NMDA receptor subunits (H. Kornau
et al., Science, 269:1737-1740 (1995); E. Kim et al., Nature, 378:85-88 (1995)),
which have a shared consensus of E-(T/S)-X-V-COOH. PDZ3 binds to an identical
sequence (D. Doyle et al., Cell, 85:1067-1076 (1996)). A PDZ3 fusion protein was
constructed by linking amino acids 302-402 of PSD-95 to the C-terminus of
glutathione S-transferase (GST). The purified protein was incubated with a 15-mer
lac I library with a complexity of 1.3 x 10^10. After 4 rounds of panning selection, a
1,700-fold enrichment of interacting peptides was achieved (see Experimental
procedures). At this stage, individual clones were randomly selected and subjected to
ELISA analysis (Figure 2A).
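
To put the library complexity in perspective, the following back-of-the-envelope sketch compares it with the number of possible 15-mers and estimates how many clones would carry the E-(T/S)-X-V C-terminus by chance. It assumes, purely for illustration, that all 20 amino acids are equally likely at every randomized position; a real degenerate oligonucleotide does not give a uniform distribution, so the numbers are order-of-magnitude guides only.

```python
AMINO_ACIDS = 20
RANDOM_POSITIONS = 15
LIBRARY_COMPLEXITY = 1.3e10  # independent clones, as stated in the text

# Number of distinct 15-residue peptides that could exist in principle
theoretical_sequences = AMINO_ACIDS ** RANDOM_POSITIONS
sampled_fraction = LIBRARY_COMPLEXITY / theoretical_sequences

# Chance that the last four residues read E-(T/S)-X-V under the uniform assumption:
# E (1 of 20), T or S (2 of 20), any residue, V (1 of 20)
p_consensus = (1 / 20) * (2 / 20) * 1.0 * (1 / 20)
expected_matching_clones = LIBRARY_COMPLEXITY * p_consensus

print(f"possible 15-mers: {theoretical_sequences:.3e}")
print(f"fraction sampled by the library: {sampled_fraction:.2e}")
print(f"clones expected to end in E-(T/S)-X-V: {expected_matching_clones:.2e}")
```

Under these assumptions the library samples only a tiny fraction of all 15-mers, yet a short C-terminal motif is still represented by millions of clones, which is why panning can recover it.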

Briefly, crude bacterial lysates from individual clones (horizontal axis of
Figure 2A) selected through four rounds of panning were prepared (see Experimental
procedures). Association of Lac I-peptide fusion with GST-PDZ3 was determined by
ELISA. Dashed bars indicate wells coated with BSA only; gray bars: GST-NABherg
+ BSA; open bars: GST-nNOS-PDZ + BSA; closed bars: GST-PDZ3 + BSA.
GST-NABherg is a fusion protein containing amino acids 1-135 from the HERG potassium
channel, which has no homology with PDZ domains (X. Li et al., J. Biol. Chem.,
272(2):705-708 (1997)). All ELISA experiments in this figure and subsequent figures
have been repeated at least once with similar results.

Enriched clones were divided into two classes. One class, such as PD-301,
PD-302, and PD-304, interacted with both GST control and GST-PDZ3 fusion
(Figure 2A), suggesting that the corresponding peptides interact with GST. The other
class of clones, including PD-312, PD-314, and PD-315, bound selectively to
GST-PDZ3. Affinity of interaction (EC50) was 2 to 100 nM as determined by quantitative
ELISA as set forth above.

To determine the binding specificity, purified recombinant PDZ fusion
proteins of nNOS (amino acids 1-150; D. Bredt et al., Nature, 351:714-718 (1991))
and disheveled (amino acids 146-226; J. Klingensmith et al., Genes Dev., 8:118-130
(1994)) were also tested for peptide binding. Under the same conditions, the
PDZ3-positive clones failed to interact with the PDZ domain of nNOS (Figure 2A) or with
the PDZ domain of disheveled. Plasmids encoding PDZ3-specific clones were
sequenced.

An alignment of the deduced amino acid sequences is shown (Figure 2B).
Indeed, most of the interacting peptides closely resemble the peptide sequence at the
C-terminus of Shaker-like potassium channels and NMDA receptor subunits, with a
consensus of E-(T/S)-X-V-COOH.

Identification of novel peptides interacting with PDZ domain of nNOS 

To determine optimal peptide ligands for the nNOS PDZ domain, a
recombinant GST fusion protein corresponding to the coding sequence of amino acids
1 to 150 of nNOS (nNOS-PDZ) was used for peptide selection. After four rounds of
panning, a 2,300-fold enrichment was achieved. Individual GST-nNOS-PDZ
specific clones were identified by ELISA (Figure 3A). It was discovered that 95 out
of 150 clones specifically interacted with nNOS-PDZ but not with the control GST
fusion protein. Binding affinity of these peptides to immobilized nNOS-PDZ (EC50)
was 8 to 100 nM. Plasmids from these nNOS specific clones were sequenced. The
deduced amino acid sequences of 95 independent clones were aligned via their
C-termini (Figures 3B and 3C).

An analysis of amino acid abundance at each position indicates that valine
again is strongly preferred (89%) at the 0 position (Figures 4A-4I). At the -1 position,
there is no obvious preference: fifteen of the twenty amino acids were found, and only
D, E, H, K and N were not present. In contrast to the PDZ3 consensus, aspartate
at the -2 position was present in 81% of all nNOS-PDZ binding peptides. At the -3
position, glycine is significantly preferred. Because glycine was used as
part of the linker that separates Lac I from the random peptide (Figure 1), the glycine
abundance was corrected for this bias. The corrected glycine abundance is 47% at the -3
position. From position -4 to position -8, no obvious amino acid preference was
observed (Figures 4A-4I). Based on the amino acid abundance at each position, the
optimal sequence for an nNOS binding peptide (NBP) is g-D-X-V-COOH.
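
The positional abundance analysis can be sketched as follows. Peptides are aligned by their C-termini, so position 0 is the last residue and position -2 is two residues before it. The eight sequences are a small subset transcribed from Figure 3B (NBP-10, -11, -12, -14, -28, -32, -33 and -36), so the frequencies printed here will not equal the 89% and 81% reported for the full set of 95 clones; the function name is illustrative.

```python
from collections import Counter

# Subset of nNOS binding peptides from Figure 3B (GGG linker included)
NBP_SUBSET = [
    "GGGVHVFGDSV",         # NBP-10
    "GGGVLGDLV",           # NBP-11
    "GGGAMEVTLLSHQPGDPV",  # NBP-12
    "GGGDAI",              # NBP-14
    "GGGGVDWV",            # NBP-28
    "GGGDAV",              # NBP-32
    "GGGRWDWV",            # NBP-33
    "GGGKRPDGVLFQRPGDLV",  # NBP-36
]


def residue_frequencies(peptides, position):
    """Residue frequencies at a position counted from the C-terminus (0 = last residue)."""
    index = position - 1  # position 0 -> index -1, position -2 -> index -3
    residues = [p[index] for p in peptides if len(p) >= -index]
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}


for pos in (0, -1, -2, -3):
    print(pos, residue_frequencies(NBP_SUBSET, pos))
```

Because short peptides such as GGGDAV reach into the GGG linker at position -3, a tally like this overcounts glycine there, which is the bias the text describes correcting.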

Specificity of NBP binding to nNOS-PDZ

Figures 5A-5D show that NBPs bind specifically to nNOS PDZ and native
nNOS protein from rat brain.

The in vitro peptide selection suggests that PDZ3 of PSD-95 and the nNOS-PDZ,
despite a shared preference for valine at the 0 position, have distinct binding
specificities. To test this directly, we performed ELISA as set forth above and found
that 36 randomly chosen NBPs failed to bind to PDZ3 of PSD-95 (Figure 5A) or to
the PDZ domain of disheveled. Based on the peptide-PDZ3 crystal structure (D.
Doyle et al., Cell, 85:1067-1076 (1996)), it is known that the side chain of His372 of
PSD-95 forms a critical sequence-specific hydrogen bond with the T at the -2 position
of the bound peptide. Interestingly, the amino acid at the corresponding position of
nNOS-PDZ is Y77, consistent with the idea that substitution of H to Y at this position
converts the -2 position peptide preference from T to D. Also in agreement with this
notion, the corresponding residue of the disheveled PDZ is N. Amino acid sequence
comparison of a number of PDZ domains present in GenBank shows that the residue
after the H or Y is also conserved (nNOS is Y-D, PDZ3 is H-E). To determine
whether the Y77 of nNOS is critical, we mutated Y77D78 to H77E78. This mutant,
nNOS-PDZHE, lost its ability to bind D-X-V peptides and gained the ability to bind
T-X-V peptides (Figure 5B).

To evaluate the specificity of the NBP-nNOS interactions, we mutated the D
at the -2 position of the NBP-123 (LDRLRNRVHGDAV-COOH, EC50 = 40 nM)
peptide to A, L, Q, R, S, T, and V. Peptides with these amino acid substitutions failed
to interact with nNOS-PDZ (Figure 5C). To test whether NBPs bind to native nNOS
protein, we generated an affinity column linking NBP-123 to an agarose matrix (see
the experimental procedures set forth above). We found that nNOS protein in crude
rat brain homogenates adhered to the NBP-123 matrix. In contrast, nNOS did not
bind to an analogous column in which the -2 D residue of NBP-123 was mutated to T
(Figure 5D).
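
For reference, the panel of -2 position variants of NBP-123 described above can be written out with a few lines of code. This is only a string-level illustration of the substitutions named in the text; the helper name is hypothetical and positions are counted from the C-terminus, with 0 as the last residue.

```python
NBP_123 = "LDRLRNRVHGDAV"  # sequence given in the text, C-terminus on the right


def substitute_from_c_terminus(peptide, position, residue):
    """Return the peptide with one residue replaced at a C-terminal position (0 = last)."""
    index = len(peptide) - 1 + position
    return peptide[:index] + residue + peptide[index + 1:]


# Substitutions of the -2 aspartate tested in the text
variants = {aa: substitute_from_c_terminus(NBP_123, -2, aa) for aa in "ALQRSTV"}

for aa, seq in variants.items():
    print(f"D->{aa} at -2: {seq}")
```

The D-to-T variant produced here corresponds to the control peptide used for the affinity column comparison.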

The nNOS-PDZ Domain Has Unique Structural Features

Previous studies have shown that the N-terminal domain of nNOS (amino
acids 1-150) binds to the PDZ domain of α1-syntrophin and to the second PDZ
domains of PSD-95 and PSD-93 (J. Brenman et al., Cell, 84:757-767 (1996)).
Although amino acids 16 to 100 of nNOS define the consensus PDZ domain, binding
studies have shown that fusions containing amino acids 1 to 100 of nNOS do not bind
to the PDZ domain of either α1-syntrophin or PSD-93 (J. Brenman et al., Cell,
84:757-767 (1996)). To test whether the peptide binding property of the nNOS-PDZ
is confined to the typical consensus, we tested whether any of five randomly selected
NBPs interact with a fusion protein containing nNOS 1-100. We found that all 5
NBPs bind to nNOS (1-150) but not to nNOS (1-100).

To determine the minimal functional structure for nNOS-PDZ to bind NBPs
and PSD-93, we generated a panel of six fusion proteins that express various regions
of the N-terminus of nNOS (Figure 6). We first evaluated binding of these constructs
to the PDZ repeats in PSD-93 using the yeast two-hybrid analysis. Binding to PSD-93
required amino acids 16-130 of nNOS; truncations on either side of this core
nNOS-PDZ eliminate the interaction. Similarly, all NBPs required amino acids 16-130
for binding as tested by ELISA (Figure 6). These studies indicate that the
functional nNOS-PDZ requires additional amino acids beyond the conserved
consensus and that both peptide-PDZ and PDZ-PDZ interactions of nNOS
likely require a similar tertiary structure.

Candidate proteins that interact with nNOS

Identification of the ligand binding consensus of nNOS-PDZ allows an
electronic search for potential nNOS interacting proteins present in the protein
databases. A pre-release version of the XREFPatScan software, written in the Perl
programming language, was used to find all occurrences of the D-X-V pattern at the
carboxy-terminus of protein sequences in the non-redundant protein database (nr, 11
Nov 1996) maintained at the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov). This sequence pattern scan revealed 484 matches
in the database. Interestingly, this list of potential binding partners includes both
glutamate and melatonin receptors, which are well known to influence nNOS activity.
See Figures 8A-8R for more detailed results of the PDZ scan of the database.
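
The kind of C-terminal pattern scan described above can be illustrated with a short regular-expression sketch. This is not the XREFPatScan program, whose interface is not described here; it is a stand-in written in Python, and the record names and toy sequences (other than the NBP-123 peptide quoted earlier in the text) are invented for illustration.

```python
import re

# D, then any single residue, then V, anchored at the end of the sequence
DXV_C_TERMINUS = re.compile(r"D[A-Z]V$")


def ends_with_dxv(sequence: str) -> bool:
    """True if the sequence terminates in the D-X-V consensus."""
    return DXV_C_TERMINUS.search(sequence.strip().upper()) is not None


def scan(records):
    """Return the names of all (name, sequence) records whose C-terminus matches D-X-V."""
    return [name for name, seq in records if ends_with_dxv(seq)]


toy_records = [
    ("toy_protein_1", "MKTLLAGDSV"),   # ends in D-S-V: match
    ("toy_protein_2", "MSSRRETSV"),    # ends in T-S-V: no match
    ("toy_protein_3", "MADEVK"),       # D-E-V is internal, not terminal: no match
    ("NBP-123", "LDRLRNRVHGDAV"),      # peptide from the text, ends in D-A-V: match
]

print(scan(toy_records))
```

Anchoring the pattern with `$` is what restricts matches to the carboxy-terminus; the same expression without the anchor would also report internal D-X-V tripeptides.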

Another suitable software package is the SASP package available from GCG 
(Genetics Computer Group, University Research Park, Madison WI). 

In summation, we have employed a powerful genetic strategy to identify
C-terminal peptide ligands for the nNOS PDZ domain. This strategy takes advantage of
the strong protein-DNA association between the lac repressor and the lac operator
sequence. This interaction is used to obtain a highly complex library of expressed
peptides each bound to the plasmid that encodes them. By simply panning for peptide
binding and then sequencing the corresponding plasmids, we were able to rapidly
determine optimal binding partners for the nNOS-PDZ. Identified peptides bind
potently to nNOS with binding affinities (EC50) in the 8-100 nM range, similar to the
affinity between the NMDA receptor and PDZ domain of PSD-95 (B. Muller et al.,
Neuron, 17:255-265 (1996)). These peptide sequences are likely to be
physiologically relevant because a similar panning procedure yielded the known
peptide ligands for PDZ3 of PSD-95.

The consensus peptide binding sequence for the nNOS-PDZ is D-X-V, which
contrasts with the E-(T/S)-X-V found for PDZs of PSD-95 (D. Doyle et al., Cell,
85:1067-1076 (1996); E. Kim et al., Nature, 378:85-88 (1995); H. Kornau et al.,
Science, 269:1737-1740 (1995); B. Muller et al., Neuron, 17:255-265 (1996); M.
Niethammer et al., J. Neurosci., 16:2157-2163 (1996)). Analysis of the crystal
structure of peptide-bound PDZ3 suggests rational explanations for these alternate
specificities (D. Doyle et al., Cell, 85:1067-1076 (1996)). Similar preference of the two
domains for terminal valine is expected because the critical residues in the
carboxylate binding loop of PDZ3, including the GLGF tetrapeptide, are precisely
conserved in nNOS-PDZ. While the carboxylate loop of PSD-95 binds most potently
to peptides with C-terminal valine, other terminal hydrophobic amino acids are
permitted. Such degeneracy was also found in some nNOS binding peptides, e.g.,
NBP-14 (Figures 3B and 3C). Inwardly rectifying potassium channel subunits of
class 2.0 terminate with S-X-I and these channels also bind to PSD-95. In addition,
the -2 serine of Kir 2.3 serves as a potent substrate for protein kinase A and this
phosphorylation event regulates binding of the channel to PSD-95 (N. Cohen et al.,
Neuron, 17:759-767 (1996)).

Specificity of PDZ3 for T/S at the peptide -2 position is mediated by hydrogen
bonding of the hydroxyl of the T/S with the N-3 nitrogen of H372 of PDZ3 (D. Doyle
et al., Cell, 85:1067-1076 (1996)). The corresponding residue in nNOS is Y77. The
greater electrophilic character of Y compared to H may explain the preference of the
nNOS PDZ for the acidic amino acid D at peptide position -2. Accordingly, mutation
of Y77D78 of nNOS to H77E78 changes the binding specificity from DXV to TXV.
Interestingly, the Y77 position is not generally conserved in other orphan PDZ domains
and this single residue may allow for much of the diverse peptide ligand specificity at
the -2 position.

These studies emphasize that the nNOS PDZ domain has unique structural
features. The consensus PDZ domain contains 80 amino acids, and PDZ3 of PSD-95
was functionally active as a 101 amino acid polypeptide (D. Doyle et al., Cell,
85:1067-1076 (1996)). By contrast, a functional nNOS PDZ domain requires an
additional 30 amino acids C-terminal to the identified consensus. We wondered
whether the smaller nNOS constructs, such as nNOS 1-100, were inactive due to a
non-specific problem with polypeptide folding. However, circular dichroism (CD)
analysis indicated a predicted high degree of secondary structure for nNOS 1-100
consisting of ~X% α-helix and ~Y% β-strand. This is similar to the composition
of α-helix and β-strand found in the PDZ3 structure of PSD-95. Furthermore, nNOS
1-100 showed thermal stability to 42°C, which is comparable to the thermal stability of a
functionally active PDZ domain of FAP. Therefore, we believe that the functional
nNOS PDZ has a structure somewhat larger than that of other PDZ domains. By
using our genetic peptide selection strategy, it will be possible to determine whether
other PDZ domains are also larger than the presently identified consensus. See K.
Christopherson et al., J. Clin. Invest., 100:2424-2429 (1997); and N. Stricker et al.,
Nat. Biotechnol., 15:336-342 (1997), the disclosures of which are hereby incorporated
by reference.

In addition to interacting with peptide ligands, the PDZ domain of nNOS
associates with other PDZ domains, including the PDZ domain of α1-syntrophin and
the second PDZ of PSD-95 and PSD-93. A three-dimensional structure of a PDZ/PDZ
heterodimer is not yet available, but our data suggest the PDZ/PDZ binding interface
overlaps with the peptide recognition sequences. Thus, deletions of nNOS PDZ that
abolish peptide binding also eliminate binding to α1-syntrophin and PSD-93.
Crystallography of PDZ3 of dlg showed that the PDZ domain forms a dimer in which
the surface of the peptide-binding domain of one PDZ subunit interacts with residues
in β-strands from the other subunit (J. Cabral et al., Nature, 382:649-652 (1996)).
This binding topology of PDZ domains may explain why the SXV peptide of the
NMDA receptor 2B potently blocks nNOS binding to PSD-95 (J. Brenman et al.,
Cell, 84:757-767 (1996)). Proteins containing the DXV nNOS interacting domain
may also disrupt interaction of nNOS with PDZ proteins. This may explain the
paradoxical situation that α1-syntrophin, but not nNOS, is present at the sarcolemma in
patients with Becker muscular dystrophy (D. Chao et al., Journal of Experimental
Medicine, 184:609-618 (1996)). Perhaps, in the myofibers of these patients, the
nNOS PDZ is occupied by a protein with a C-terminal D-X-V and is unable to bind to
α1-syntrophin.

The disclosed genetic selection strategy will help identify peptide ligands for
the hundreds of orphan PDZ domains that have been sequenced. After isolating high
affinity peptides, protein database analysis may suggest candidate physiological
binding partners. Our search with the terminal DXV consensus for nNOS yielded
several attractive candidates including melatonin receptor 1a (U14108) and an
alternatively spliced form of GluR6 (X66117). Though nNOS is best activated by
calcium influx through NMDA receptors (J. Garthwaite et al., Nature, 336:385-388
(1988)), there is also abundant literature showing that nNOS activity can be regulated
by melatonin (D. Vesely, Mol. Cell. Biochem., 35:55-58 (1981)) and by non-NMDA
type glutamate receptors (J. Garthwaite et al., Annu. Rev. Physiol., 57:683-706
(1995)). Our data suggest that physical association of nNOS with GluR6 and with
melatonin receptors may participate in this functional coupling.

The invention has been described with reference to preferred embodiments
thereof. However, it will be appreciated that those skilled in the art, upon 
consideration of this disclosure, may make modifications and improvements within 
the spirit and scope of the invention as set forth in the following claims. 

What is claimed is:

1. A peptide of at least 3 amino acids comprising the sequence D-X-V-COOH,
wherein D=Aspartic acid, X=any amino acid and V=Valine.

2. An isolated nucleic acid encoding the peptide of claim 1.

3. A method for determining the identity of proteins which interact with a 
protein binding domain (orphan protein domain) of a first protein (Protein Interaction 
Network (PIN)) comprising: 

screening a random peptide library comprising transformed host cells, each of 
which contains a plasmid that comprises a lacO binding site and encodes a fusion 
protein comprising a Lac repressor DNA binding protein fused to a peptide, wherein 
each transformed host cell differs from one another with respect to the peptide in said 
fusion protein, said screening comprising lysing the host cells under conditions that 
the fusion protein remains bound to the plasmid at the lacO binding site, contacting 
the fusion proteins of the random peptide library with a protein binding domain 
(orphan protein domain) under conditions conducive to specific peptide-protein 
binding domain (orphan protein domain) binding; 

isolating the plasmid that encodes a peptide that binds to the protein binding 
domain (orphan protein domain); 

sequencing the plasmid to obtain the sequence of the peptide that binds to the 
protein binding domain (orphan protein domain); and 

searching the available nucleic acid and protein sequence databases to identify
proteins which comprise the sequence of the peptide which binds to the protein
binding domain (orphan protein domain).

4. The method of claim 3, further comprising the step of: assembling the
PINS from different orphan protein domains into an electronic databank that can be
searched with the sequence of a protein domain (orphan protein domain) of interest.

5. A method of treating a neurodegenerative disease, motility disorder or
muscular dystrophy in a human or animal comprising administering to a patient in
need thereof an effective amount of the peptide of claim 1.

6. The peptide of claim 1, wherein said peptide comprises at least 5
amino acids.

7. The peptide of claim 1, wherein said peptide comprises at least 10
amino acids.

8. The peptide of claim 1, wherein said peptide comprises at least 15
amino acids.

9. A peptide ligand detection system comprising: 

a) a random peptide library comprising a recombinant DNA vector 
encoding a DNA binding protein that specifically binds a DNA sequence on the 
vector, the DNA binding protein comprising a covalently linked sequence encoding a 
random peptide sufficient for the vector to encode at least about 10^6 different fusion
proteins each of which is capable of specifically binding the DNA sequence on the 
vector; and 

b) an orphan protein domain sequence immobilized on a solid support 
capable of specifically binding the random peptide of the DNA binding protein. 

10. The peptide ligand detection system of claim 9 further comprising an
inducer molecule capable of specifically binding the DNA binding protein sufficient
to release the recombinant DNA vector from the immobilized orphan protein domain
sequence.

11. The peptide ligand detection system of claim 9 wherein the DNA
binding protein comprises a prokaryotic repressor protein sequence and the DNA
sequence bound by the DNA binding protein is a prokaryotic operator sequence.

12. The peptide ligand detection system of claim 11 wherein the
prokaryotic repressor protein sequence is a lac repressor or a fragment thereof capable
of specifically binding the DNA sequence on the vector.

13. The peptide ligand detection system of claim 11 wherein the
prokaryotic operator sequence is lac O or a fragment thereof capable of being
specifically bound by the prokaryotic repressor protein sequence.

14. The peptide ligand detection system of claim 10 wherein the inducer
molecule is isopropylthio-β-D-galactoside (IPTG).

15. The peptide ligand detection system of claim 11 wherein the
prokaryotic repressor protein sequence and the random peptide sequence are spaced
by a peptide linker sequence encoded by nucleic acid sequence comprising -G-G-G-.

16. A peptide ligand detected by the ligand detection system of claim 1 
having a binding affinity (EC50) for the orphan protein domain of between about 0.5 
to 500 nM. 

17. A peptide ligand comprising between about 3 and 50 amino acids
comprising an amino acid sequence consisting of D-X-V-COOH, wherein the peptide
ligand has a binding affinity (EC50) for an orphan protein domain of between about
0.5 to 500 nM.

18. The peptide ligand of claim 17, wherein the orphan protein domain is a
PDZ domain.

19. The peptide ligand of claim 18, wherein the PDZ domain is obtained
from a protein selected from the group consisting of nitric oxide synthase (nNOS),
post-synaptic density protein (PSD-95/SAP-90), post-synaptic density protein
(PSD-93), epithelial tight-junction protein zona occludens-1 (ZO-1), N-methyl-D-aspartate
(NMDA) type glutamate receptor, Shaker-type potassium channel subunit, and
α1-syntrophin.

20. A therapeutic composition comprising the peptide ligand of claim 18.

21. An isolated nucleic acid encoding the peptide ligand of claim 18.

22. A DNA vector comprising the isolated nucleic acid of claim 21. 

23. A method of detecting a peptide ligand capable of specifically binding 
an orphan protein domain of a protein, the method comprising: 

a) lysing transformed cells comprising a random peptide library comprising a 
recombinant DNA vector encoding a DNA binding protein that specifically binds a 
DNA sequence on the vector, the DNA binding protein comprising a covalently 
linked sequence encoding a random peptide sufficient for the vector to encode at least 
10^6 different fusion proteins each of which is capable of specifically binding the DNA
sequence on the vector, wherein the lysing is under conditions such that the DNA 
binding protein comprising the random peptide remains bound to the recombinant 
DNA vector, 

b) contacting the fusion proteins of the random peptide library to an 
immobilized orphan protein domain under conditions conducive to specific peptide- 
orphan protein domain binding; and 

c) isolating a recombinant DNA vector encoding a fusion protein that 
specifically binds to the orphan protein domain. 

24. The method of claim 23 further comprising the steps of transforming a 
host cell with the recombinant DNA vector obtained in step c), repeating steps a), b), 
and c) with the host cell, and isolating a selected recombinant DNA vector. 

25. The method of claim 24 further comprising determining the amino acid
sequence of the random peptide encoded by the selected recombinant DNA vector.

26. The method of claim 25 further comprising searching a protein
sequence database to identify an orphan protein domain in the database comprising
the random peptide.

27. The method of claim 26 further comprising assembling a protein 
interaction network (PIN) sufficient to correlate a plurality of random peptide 
sequences to the orphan protein domain. 

28. The method of claim 27 further comprising assembling a super protein 
interaction network (SPINS) comprising a plurality of protein interaction networks 
(PINs) sufficient to serve as an electronic extension database for the protein sequence 
database. 

29. The method of claim 26 wherein the orphan protein domain in the 
database is any one of the orphan protein domains (protein modules) shown in Figure 
7. 

30. A method of detecting a peptide ligand capable of specifically binding 
an orphan protein domain of interest, the method comprising searching a super protein 
interaction network (SPINS) with an amino acid sequence comprising an orphan 
protein domain of interest, and identifying the peptide ligand capable of specifically 
binding the orphan protein domain of interest. 

31. The method of claim 30, wherein the peptide ligand is obtained from a 
random peptide library. 
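
The database steps recited in claims 26 to 30 can be pictured with a small data-structure sketch. The Python below is illustrative only: the dictionary layout, the function names `build_pin`, `build_spins` and `find_ligands`, and the toy domain and peptide entries are assumptions introduced here, not part of the claimed method. It treats a PIN as one domain together with its selected peptide sequences, and a SPINS as an index over several PINs that can be queried for the peptide ligands recorded for a domain of interest. For simplicity the lookup is keyed by a domain name rather than by an amino acid sequence.

```python
from collections import defaultdict


def build_pin(domain, peptides):
    """Return one protein interaction network (PIN): a domain plus its peptides."""
    return {"domain": domain, "peptides": set(peptides)}


def build_spins(pins):
    """Merge several PINs into a single index keyed by domain name."""
    spins = defaultdict(set)
    for pin in pins:
        spins[pin["domain"]] |= pin["peptides"]
    return dict(spins)


def find_ligands(spins, domain):
    """Look up the peptide ligands recorded for a domain; empty list if unknown."""
    return sorted(spins.get(domain, set()))


# Toy entries for illustration only (hypothetical names and sequences).
pins = [
    build_pin("domain-A", ["GDAV", "GDLV"]),
    build_pin("domain-A", ["GDPV"]),
    build_pin("domain-B", ["ETSV"]),
]
spins = build_spins(pins)
ligands = find_ligands(spins, "domain-A")
print(ligands)
```

Keeping each PIN as a separate record and merging only at the SPINS level means a new screen can be added without rebuilding the existing entries.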

CLONE NO.   SEQUENCE

Library     GGGXXXXXXXXXXXXXXX*

PD-205      GGGMFVGDQVDLRLETSV*
PD-210      GGGMATSRPSGARRTTSV*
PD-211      GGGMSGWPHDWLGRETTV*
PD-212      GGGMFVGDQVDLRLETSV*
PD-215      GGGILIVRNLETSV*
PD-303      GGGRSLIGAVEKRQETSV*
PD-307      GGGQETLRRLSVGPETSV*
PD-312      GGGHRRSARYLESSV*
PD-314      GGGREASNKVRLRKESTV*
PD-315      GGGGPESLLWKVRRETSL*
PD-325      GGGRIELHGVLKGCETAV*

FIG. 2B 
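
The C-terminal preference visible among the FIG. 2B clones can be tallied position by position. The following is a minimal sketch, not the analysis used in the application: the function name `c_terminal_counts` and the window of four residues are assumptions, and the sequences are transcribed from FIG. 2B with the spacing of the printed figure ignored and the trailing `*` dropped.

```python
from collections import Counter

# Sequences as printed in FIG. 2B, without the trailing "*" marker.
pd_clones = {
    "PD-205": "GGGMFVGDQVDLRLETSV",
    "PD-210": "GGGMATSRPSGARRTTSV",
    "PD-211": "GGGMSGWPHDWLGRETTV",
    "PD-212": "GGGMFVGDQVDLRLETSV",
    "PD-215": "GGGILIVRNLETSV",
    "PD-303": "GGGRSLIGAVEKRQETSV",
    "PD-307": "GGGQETLRRLSVGPETSV",
    "PD-312": "GGGHRRSARYLESSV",
    "PD-314": "GGGREASNKVRLRKESTV",
    "PD-315": "GGGGPESLLWKVRRETSL",
    "PD-325": "GGGRIELHGVLKGCETAV",
}


def c_terminal_counts(sequences, width=4):
    """Count residues at each of the last `width` positions.

    Index 0 is the C-terminal residue, index 1 the residue before it, and so on.
    """
    counts = [Counter() for _ in range(width)]
    for seq in sequences:
        for offset in range(width):
            counts[offset][seq[-1 - offset]] += 1
    return counts


counts = c_terminal_counts(pd_clones.values())
for offset, counter in enumerate(counts):
    print(f"position {-offset}: {dict(counter)}")
```

Counting from the C-terminus rather than the N-terminus keeps clones of different length aligned on the residues that matter for a C-terminal binding motif.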




CLONE NO.   SEQUENCE

Library     gggXXXXXXXXXXXXXXX*

NBP-4       GGGGTPQKAVHRDWGVSV*
NBP-5       GGGIRAGGDPV*
NBP-7       GGGDPV*
NBP-8       GGGDARTKIWNRAADLI*
NBP-9       GGGAQGRWPQFCVYPDAV*
NBP-10      GGGVHVFGDSV*
NBP-11      GGGVLGDLV*
NBP-12      GGGAMEVTLLSHQPGDPV*
NBP-14      GGGDAI*
NBP-15      GGGWAGYGRGMAVSGDMV*
NBP-17      GGGFPFFMGTMGEYGIQV*
NBP-18      GGGLGKDYPSAPDNGDLV*
NBP-24      GGGIYGMMRIGTGLVDVL*
NBP-27      GGGAGQDKQAGQHWGDLV*
NBP-28      GGGGVDWV*
NBP-32      GGGDAV*
NBP-33      GGGRWDWV*
NBP-34      GGGKGHIAITSDGVGDLL*
NBP-35      GGGNYDRVGLLRGPVDFL*
NBP-36      GGGKRPDGVLFQRPGDLV*
NBP-37      GGGDAV*
NBP-41      GGGDPV*
NBP-42      GGGGDAV*
NBP-44      GGGGLARLNLSSYYGDAV*
NBP-45      GGGVDWV*
NBP-47      GGGRVIGSPNPSRSADIV*
NBP-48      GGGDWV*
NBP-49      GGGSFMNBPVAGTAGDSV*
NBP-52      GGGSRGDMV*
NBP-53      GGGDWV*
NBP-54      GGGDGMLLRRPQLRWIFC*
NBP-55      GGGKRDETGFNMWGNAV*
NBP-56      GGGWQGDPV*
NBP-57      GGGALGDPV*
NBP-59      GGGDPV*
NBP-60      GGGGDLV*
NBP-61      GGGESGSGVRTWGVPV*
NBP-62      GGGRVQLVRGGVDCV*
NBP-64      GGGDAV*
NBP-65      GGGWRWKSVMRWPDPV*
NBP-66      GGGDLV*
NBP-67      GGGSKSCGRVILGDIV*
NBP-68      GGGVDWV*
NBP-69      GGGIIQGQARGTRWGEMV*
NBP-70      GGGDAV*
NBP-71      GGGGGWPELNPNLLGVPI*
NBP-72      GGGRCMLNLVTGRWADTV*
NBP-73      GGGGMGQTLEELTTGDWV*
NBP-74      GGGDRGWAVGWGLRGVPV*
NBP-76      GGGGPARYGDSV*
NBP-77      GGGDLV*
NBP-78      GGGFSSLVLGAGDLGVAP*
NBP-79      GGGMQWWAQRDLAGDCV*
NBP-81      GGGKDGGRQGANFFGDAV*
NBP-82      GGGTWGRAV*

CLONE NO.   SEQUENCE

NBP-83      GGGLKSTGSEVNSLGDW*
NBP-84      GGGSEATAVWTSKWSDLV*
NBP-85      GGGPVSSVRYSGVAGDQV*
NBP-86      GGGLWSDAV*
NBP-87      GGGRVTGRSSYLGMGDIV*
NBP-88      GGGDMV*
NBP-89      GGGKFSVRHTLVSAGDPV*
NBP-91      GGGARGQLPATRCKAFLC*
NBP-92      GGGYEEGVAV*
NBP-93      GGGDRV*
NBP-94      GGGDLV*
NBP-95      GGGVRGALTRGMTPGDPV*
NBP-96      GGGDLV*
NBP-102     GGGVAGVGKYGDLV*
NBP-103     GGGDLV*
NBP-107     GGGDVI*
NBP-108     GGGKMRVGVDAV*
NBP-111     GGGDPV*
NBP-112     GGGRDSERLMGIPV*
NBP-113     GGGDQV*
NBP-114     GGGRWSEGDGV*
NBP-117     GGGLGRGSVRPGRRPDIV*
NBP-118     GGGDW*
NBP-119     GGGIKRLDIYMRNIGDLV*
NBP-122     GGGSATAWNGDPV*
NBP-123     GGGLDRLRNRVHGDAV*
NBP-124     GGGREVSVCHRPDAGDAV*
NBP-125     GGGSRVPRNTSIFWGNAV*
NBP-126     GGGDCGNVTHAILWGDAV*
NBP-128     GGGKALGAIYVMGGVDAV*
NBP-129     GGGWGSPV*
NBP-131     GGGKGSPSLVGPVWADAV*
NBP-133     GGGILNPVPRNLSEGDYV*
NBP-136     GGGDQV*
NBP-137     GGGGERLNRSATAGADLV*
NBP-138     GGGEGGRNPDIV*
NBP-140     GGGNQRYWNPFIWGQSV*
NBP-142     GGGDSINLSWPVAV*
NBP-143     GGGCMLQVRHIYGPCDAV*
NBP-161     GGGVIGKSCYGDAV*
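
Several clones in the table above carry identical sequences, which suggests the same peptide was isolated more than once. One way to list such repeated isolates is sketched below. This is only an illustration: the helper name `group_by_sequence` is an assumption, only a subset of rows is included, and the clone numbers are read with the spacing of the printed table collapsed (for example `NBP- 3 2` taken as NBP-32).

```python
from collections import defaultdict

# A subset of rows from the clone table (trailing "*" omitted).
clones = [
    ("NBP-7", "GGGDPV"),
    ("NBP-32", "GGGDAV"),
    ("NBP-37", "GGGDAV"),
    ("NBP-41", "GGGDPV"),
    ("NBP-54", "GGGDGMLLRRPQLRWIFC"),
    ("NBP-59", "GGGDPV"),
    ("NBP-64", "GGGDAV"),
    ("NBP-66", "GGGDLV"),
    ("NBP-70", "GGGDAV"),
    ("NBP-77", "GGGDLV"),
]


def group_by_sequence(rows):
    """Map each peptide sequence to the list of clone IDs that carry it."""
    groups = defaultdict(list)
    for clone_id, seq in rows:
        groups[seq].append(clone_id)
    return dict(groups)


groups = group_by_sequence(clones)
# Keep only sequences recovered in more than one clone.
repeated = {seq: ids for seq, ids in groups.items() if len(ids) > 1}
for seq, ids in repeated.items():
    print(seq, ids)
```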

1. Endoplasmic reticulum targeting sequence 

2. Microbodies C-terminal targeting signal 

3. Gram-positive cocci surface proteins anchoring hexapeptide 

4. Bipartite nuclear targeting sequence 

5. Cell attachment sequence 

6. ATP/GTP-binding site motif A (P-loop) 

7. Cyclic nucleotide-binding domain signatures 

8. EF-hand calcium-binding domain 

9. Actinin-type actin-binding domain signatures 

10. Anaphylatoxin domain signature and profile 

11. Apple domain 

12. Band 4.1 family domain signatures 

13. C1q domain signature 

14. C-terminal cystine knot signature and profile 

15. CUB domain profile 

16. Death domain profile 

17. EGF-like domain signatures 

18. Calcium-binding EGF-like domain signature 

19. Forkhead-associated (FHA) domain profile 

20. Fibrinogen beta and gamma chains C-terminal domain signature 

21. Type II fibronectin collagen-binding domain 

22. Hemopexin domain signature 

23. Kringle domain signature 

24. LDL-receptor class A (LDLRA) domain signature 

25. C-type lectin domain signature 

26. Osteonectin domain signatures 

27. Somatomedin B domain signature 

28. Thyroglobulin type-1 repeat signature 

29. P-type ("Trefoil") domain signature 

30. Cellulose-binding domain, bacterial type 

31. Cellulose-binding domain, fungal type 

32. Chitin recognition or binding domain signature 

33. Barwin domain signatures 

34. WAP-type "four-disulfide core" domain signature 

35. Phorbol esters/diacylglycerol binding domain 

36. C2 domain signature and profile 

37. CAP-Gly domain signature 

38. Ly-6/u-PAR domain signature 

39. MAM domain signature 

40. PH domain profile 

41. Phosphotyrosine interaction domain (PID) profile 

42. Src homology 2 (SH2) domain profile 

43. Src homology 3 (SH3) domain profile 

44. VWFC domain signature 

45. WW/rsp5/WWP domain signature and profile 

46. ZP domain signature 

47. S-layer homology domain signature 

FIG. 7 






Results: PDZ scan (D-X-V) vs. non-redundant protein database 



Human 

>Si!56V07!pir;iA3 i 159 115 K fusion protein - human gi!3 87034 (M21610) RNA polymerase II [Homo 
sapiens] (Match DLV) 

>gili8t745 (M36472) MHC class II cell surface protein [Homo sapiens] (Match DTV) 
>gii5^3732Is?iP36639!8ODP_HUMAN 7,8-DIHYDRO-8-OXOGUANINE TRIPHOSPHATASE (8-OXO- 
DGTPASE). gii5 4274S !pirilA-8336 8-oxo-7,8-dihydroguanosine triphosphatase - human giI452589 
(D\65 81) 8-oxo-dGTPase [Homo sapiens] gill 405350 (D38594) 8-oxo-dGTPase [Homo sapiens] (Match 
DTV) 

>gill77776 (MS63-&1) serotonin receptor [Homo sapiens] (Match DGV) 

>gill 149^7ls ? i?l£27?i3C^Lr7J>LOf BETA-GALACTOSIDASE-RELATED PROTEIN 
PRECURSOR. ziU 05^3 ^'cirilE 32633 beta-galactosidase-related protein - human gil 179421 (M2750S) 
beta-galactosidase related protein precursor [Homo sapiens] (Match DEV) 

>gilI87273 (MS4;*0) lysyl oxidase [Homo sapiens] (Match DLV) 

>gi!553572 (M338S7) MHC class II HLA-DQ-alpha-1 [Homo sapiens] (Match DTV) 

>gill 1494Glspi? 1 52783 BGAL_HUMAN BETA-GALACTOSIDASE PRECURSOR 
(LACTASE). giiS6?3a. : piri!A32611 beta-galactosidase (EC 3.2.1.23) precursor - human gil 179401 
(M27507) beta-D-galactosidase precursor (EC 3.2.1.23) [Homo sapiens] gil 179423 (M34423) beta- 
galactosidase precursor (EC 3.2.1.23) [Homo sapiens] (Match DEV) 

>gi!179419 (M225S0) beta-galactosidase precursor (EC 3.2.1.23) [Homo sapiens] (Match DHV) 
>gil!8 1759 (M63 195) DR3 1 transplantation antigen [Homo sapiens] (Match DTV) 
>gil 124462IspiP 1713 IINR1_HUMAN INTERFERON-ALPHA/BETA RECEPTOR ALPHA CHAIN 
PRECURSOR (IFN-ALPHA-REC). gill0679OlpirilA32694 interferon alpha receptor precursor - 
human giGC6914 (J03I71) interferon-alpha receptor precursor [Homo 
sapiens] gil 15673 85ig:dlFEDIe25 1623 (A32391) chimeric IFNalpha/beta-receptor [Homo sapiens] (Match 
DFV) 

>gi!30972 (2142C6) Ig heavy chain variable region (VDJ) [Homo sapiens] (Match DMV) 

>gi!32672 (X60459) Human IFNAR gene for interferon alpha/beta receptor [Homo sapiens] (Match DFV) 

>gill25472!spiPl0721!KIT_HUMAN MAST/STEM CELL GROWTH FACTOR RECEPTOR 
PRECURSOR (SCFR) (PROTO-ONCOGENE TYROSINE-PROTEIN KINASE KIT) (C-KIT) 
(CD117). gil66311lpir:rTVHUKT protein-tyrosine kinase (EC 2.7.1.112) kit precursor - human ei!34085 
(X06182) protein pi45-ckit (AA 1 - 976) [Homo sapiens] giiS25636 (X69301) mast/stem cell growth 
factor receptor [Homo sapiens] (Match DDV) 

>gi!34992 (X17161) Beta 1-subunit of Na(+),K(+)-ATPase [Homo sapiens] (Match DRV) 
>giI631336lpir«S42563 POU domain protein - human gi!437809 (221963) POU domain protein [Homo 
sapiens] (Match DW) 

>gi!4373ll (221964) POU domain protein [Homo sapiens] (Match DW) 
>gil437S!3 (221965) POU domain protein [Homo sapiens] (Match DW) 
>gi!l 17098lsptP2G674:COXA_HUMAN CYTOCHROME C OXIDASE POLYPEPTIDE VA 
PRECURSOR. gi!66276ipirI!OTHU5A cytochrome-c oxidase (EC 1.9.3.1) chain Va precursor - 
human giI695360 (M22760) cytochrome c oxidase subunit Va [Homo sapiens] (Match DKV) 
**Not human** >gU535709lsclQ04544iPOLN.SOUV3 NON-STRUCTURAL POLYPROTEIN 
(CONTAINS: RNA-DIRECTED RNA POLYMERASE, THIOL PROTEASE, HELICASE (2C LIKE 
PROTEIN)). gil476733ipiri!A3749l orfl putative helicase/polymerase polyprotein - Southampton 
virus giI444364iprf.U9C64l03 rheumatoid factor VH [Homo sapiens] (Match DGV) 



Figure 8A 
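
The listing in Figure 8A pairs each database record with the C-terminal tripeptide that matched the D-X-V pattern. A scan of that kind can be sketched as follows. The application does not state which tool produced the listing, so the code is an assumption throughout: the function names `parse_fasta` and `scan_c_terminal`, the regular expression, and the three toy records are hypothetical and are not taken from the database.

```python
import re

# D, any one residue, then V, anchored at the C-terminus of the sequence.
DXV = re.compile(r"D[A-Z]V$")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def scan_c_terminal(text):
    """Return (header, tripeptide) for every record ending in D-X-V."""
    hits = []
    for header, seq in parse_fasta(text):
        match = DXV.search(seq)
        if match:
            hits.append((header, match.group(0)))
    return hits


# Hypothetical records for illustration only.
toy_db = """\
>toy_protein_1 hypothetical example
MKTAYIAKQR
LGDSV
>toy_protein_2 hypothetical example
MSDAVKLLE
>toy_protein_3 hypothetical example
MADKV
"""

hits = scan_c_terminal(toy_db)
for header, tripeptide in hits:
    print(f">{header} (Match {tripeptide})")
```

The second toy record contains D-A-V internally but not at its C-terminus, so it is not reported; anchoring the pattern with `$` restricts the scan to the last three residues.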

>gill346544lsplP48039IMLlA_HUMAN MELATONIN RECEPTOR TYPE 1A 
(MEL-1A-R). gil602130 (U 14108) Mel- la melatonin receptor [Homo sapiens] 
(Match DSV) 

>gil726255 (U22228) aggrecan [Homo sapiens] (Match DFV) 

>gil793763 (D26512) MT-MMP [Homo sapiens] (Match DKV) 

>gil804994 (X83535) MT-MMP [Homo sapiens] (Match DKV) 

>gil963054 (Z48481) membrane-type matrix metalloproteinase 1 [Homo sapiens] 

gill 127837 (U41078) membrane-type matrix metalloproteinase- 1 [Homo sapiens] 

(Match DKV) 

>gil976297 (L37839) This CDS feature is included to show the translation of the 
corresponding V_segment. Presently translation qualifiers on V_segment features 
are illegal. [Homo sapiens] (Match DAV) 

>gill 24746 HgnllPIDIe200676 (A26595) interferon beta receptor [Homo sapiens] 
(Match DFV) 

>gill262584 (D90161) leader sequence, L" [Homo sapiens] (Match DPV) 
>gill495995lgnllPIDIel96537 (X90925) MT-MMP protein [Homo sapiens] (Match 
DKV) 



Mouse 

>gil244607lbbsl79586 cleaved prolactin- 1, clPRL-l=fragment A [rats, Peptide 
Partial, 20 aa] (Match DRV) 

>gil497021 (U05699) cytochrome c oxidase subunit Va [Mus spretus] (Match 
DKV) 

>gil505029 (D 14849) meiosis-specific nuclear structural protein 1 [Mus musculus] 
(Match DGV) 

>gil531881 (U12877) vascular cell adhesion molecule-1 [Mus musculus] (Match 
DTV) 

>gill91913 (Ml 1895) A-l alpha-amylase [Mus musculus] (Match DKV) 

>gill91919 (Ml 1896) B-l alpha-amylase [Mus musculus] (Match DKV) 

>gill92098 (M18187) B144 protein A [Mus musculus] (Match DYV) 

>gil 1 96056 (M34984) Ig H-chain [Mus musculus] (Match DTV) 

>gil554244 (K03547) myb protein [Mus musculus] (Match DSV) 

>gil 1 363 1 94lpirll A53202 MAMA protein precursor - mouse gi!297033 (X67809) 

mama gene product [Mus musculus] (Match DMV) 

>gil423447lpirllS35792 glutamate receptor GluR6C - mouse gil3 12494 (X66117) 
glutamate receptor subunit GluR6C [Mus musculus] (Match DTV) 
>gill 17099lsplP12787ICOXA_MOUSE CYTOCHROME C OXIDASE 
POLYPEPTIDE VA PRECURSOR. gil90420lpirllS05495 cytochrome-c oxidase 
(EC 1.9.3.1) chain Va precursor - mouse gil50527 (X15963) cytochrome c oxidase 
subunit Va preprotein [Mus musculus] (Match DKV) 
>gil805000 (X83536) MT-MMP [Mus musculus] (Match DKV) 

>gil939951 (X73037) partial paired box; pid:e74985 [Mus musculus] (Match DGV) 

>gil 11 84877 (U46562) MHC class II transactivator CIITA [Mus musculus] (Match 

DMV) 

>gill215666 (U17267) T cell receptor-Zeta [Mus musculus] (Match DEV) 
>gill326151 (U52222) Mel- 1 a melatonin receptor [Mus musculus] (Match DSV) 



Rat 

>gil666942 (M22615) cholesterol side-chain cleavage enzyme [Rattus norvegicus] 
(Match DTV) 

>gi 1 1 1 2437 Ipirll S 206 1 2 triacylglycerol lipase (EC 3.1.1.3) - rat gil56600 (X61925) 
triacylglycerol lipase [Rattus norvegicus] (Match DTV) 
>gill 17262lsplP14137ICPMl_RAT CYTOCHROME P450 XIA1, 
MITOCHONDRIAL PRECURSOR (P450(SCC)) (CHOLESTEROL SIDE- 
CHAIN CLEAVAGE ENZYME) (CHOLESTEROL DESMOLASE). 
gil92074ipirl I A34 1 64 cholesterol monooxygenase (side-chain-cleaving) (EC 
1.14.15.6) cytochrome P450 11A1 - rat gil203561 (M63133) cytochrome P-450-scc 
[Rattus norvegicus] gil203639 (J05156) cholesterol side-chain cleavage enzyme 
precursor (EC 1.14.15.6) [Rattus norvegicus] (Match DTV) 
>gil204101 (K01336) beta-fibrinogen [Rattus norvegicus] (Match DKV) 
>gil206148 (Ml 6960) calcium-calmodulin-dependent protein kinase II [Rattus 
norvegicus] (Match DGV) 

>gill 17 lOOIspIPl 1240ICOXA_RAT CYTOCHROME C OXIDASE 

POLYPEPTIDE VA PRECURSOR. gil92182lpirllS04592 cytochrome-c oxidase 

(EC 1.9.3.1) chain Va precursor - rat gil55971 (XI 5030) cytochrome c oxidase 

subunit Va preprotein [Rattus norvegicus] (Match DKV) 

>gil682650 (LI 9 118) complement receptor type 1 [Rattus norvegicus] (Match 

DQV) 

>gil805013 (X83537) MT-MMP [Rattus norvegicus] (Match DKV) 
>gill001927 (X91785) membrane-type metalloproteinase [Rattus norvegicus] 
(Match DKV) 
>gill334296lgnllPIDIel0391 (X03914) interleukin-3 (aa 102-115) [Rattus 
norvegicus] (Match DSV) 

D. melanogaster 

>gil46 1 852lsplP35220ICTNA_DROME ALPHA-CATENIN. gil422436lpirllA40694 
cadherin-associated protein D alpha-catenin - fruit fly (Drosophila melanogaster) 
gil285752 (D 13964) alpha-catenin [Drosophila melanogaster] (Match DAV) 
>gil259790lbbsll 17942 (S48157) DNA polymerase-primase 180 kda subunit 
[Drosophila melanogaster, Peptide, 1490 aa] (Match DW) 
>gil546972lbbsl 148992 (S70576) putative receptor tyrosine kinase=Dret 
[Drosophila melanogaster, Canton-S, Peptide Partial, 817 aa] (Match DAV) 
>gil321036lpirllPS0443 potassium channel protein Slo G3 - fruit fly (Drosophila 
melanogaster) (fragment) (Match DLV) 



C. elegans 

>gil465792lsplP34428IYL37_CAEEL HYPOTHETICAL 45.5 KD PROTEIN 
F44B9.7 IN CHROMOSOME III. gil630626lpirllS44810 F44B9.7 protein - 
Caenorhabditis elegans gil388589 (L23648) putative [Caenorhabditis elegans] 
(Match DQV) 

>gil466054lsplP34680IYO42_CAEEL HYPOTHETICAL 32.7 KD PROTEIN 
ZK757.2 IN CHROMOSOME III. gil482218lpirllS41012 hypothetical protein 
ZK757.2 - Caenorhabditis elegans gil438368 (Z29121) ZK757.2 [Caenorhabditis 
elegans] (Match DW) 

>gil458953 (U00031) similar to phosphatidylserine decarboxylase [Caenorhabditis 
elegans] (Match DGV) 

>gil722365 (U22833) W02B3.5 [Caenorhabditis elegans] (Match DFV) 
>gil746503 (U23516) B0416.2 gene product [Caenorhabditis elegans] (Match 
DDV) 

>gil 10 19950 (U37429) similar to protein kinase C [Caenorhabditis elegans] (Match 
DSV) 

>gill055055 (U39850) coded for by C. elegans cDNA yk37gl.5; coded for by C. 

elegans cDNA yk5c9.5; coded for by C. elegans cDNA ykla9.5; alternatively 

spliced form of F52C9.8b [Caenorhabditis elegans] (Match DNV) 

>gi!10551 10 (U39995) coded for by C. elegans cDNA yk25b9.3; coded for by C. 

elegans cDNA yk25b9.5 [Caenorhabditis elegans] (Match DRV) 

>gil 1086851 (U41270) Similar to transmembrane domain of family 1 of G-protein 

coupled receptors. [Caenorhabditis elegans] (Match DEV) 

>gill082139 (Z68118) R01E6.2 [Caenorhabditis elegans] (Match DFV) 
>gilll00868lgnllPIDIe212230 (Z68135) ZK1073.2 [Caenorhabditis elegans] 
(Match DNV) 

>gill352438lsplQ10055IIF4N_SCHPO EUKARYOTIC INITIATION FACTOR 
4A-LIKE PROTEIN C1F5.10. gill 103737 (Z68136) unknown 
[Schizosaccharomyces pombe] (Match DMV) 

>gill 1 18060 (U41552) coded for by C. elegans cDNA yk3dl 1.5; coded for by C. 

elegans cDNA yk5f4.5 [Caenorhabditis elegans] (Match DIV) 

>gil 1125770 (U42838) T08G2.2 gene product [Caenorhabditis elegans] (Match 

DDV) 

>gill 185450 (U36581) cyclophilin isoform 9 [Caenorhabditis elegans] (Match 
DLV) 

>gill229053lgnllPIDIe229193 (Z70207) F15A2.6 [Caenorhabditis elegans] (Match 
DKV) 

>gill255324 (U51999) C43H6.7 gene product [Caenorhabditis elegans] (Match 
DIV) 

>gill255397 (U53150) F20A1.2 gene product [Caenorhabditis elegans] (Match 
DSV) 

>gill313955lgnllPIDIe241752 (Z73098) T21C9.13 [Caenorhabditis elegans] (Match 
DIV) 

>gil 16277 17lgnllPIDIe276022 (Z81053) E02A10.4 [Caenorhabditis elegans] 
(Match DIV) 

>gill627903lgnllPIDIe275743 (Z81076) F35C5.f [Caenorhabditis elegans] (Match 
DGV) 

>gil 1658357 (U64849) K04A8.8 gene product [Caenorhabditis elegans] (Match 
DKV) 



S. cerevisiae 

>gil728821lsplP39010IAKRl_YEAST ANKYRIN REPEAT-CONTAINING 
PROTEIN AKR1. gil626094lpirllS48521 AKR1 protein - yeast (Saccharomyces 
cerevisiae) gil466522 (L31407) ankyrin repeat-containing protein [Saccharomyces 
cerevisiae] gill 230637 (U51030) Ankyrin repeat-containing protein (Swiss Prot. 
accession number P39010). [Saccharomyces 
cerevisiae] gill586336lprfll2203403A ankyrin repeat-containing protein 
[Saccharomyces cerevisiae] (Match DMV) 

>gil731840lsplP40500IYII9_YEAST HYPOTHETICAL 23.9 KD PROTEIN IN 
SGA1-THS1 INTERGENIC REGION. giil077785lpirllS49791 hypothetical protein 
YI9910.07 - yeast (Saccharomyces cerevisiae) gil577125 (Z46728) YI9910.07, 
unknown orf, len: 205, CAI: 0.1 1 [Saccharomyces cerevisiae] gil763257 (Z47047) 
unknown [Saccharomyces cerevisiae] (Match DEV) 

>gill40345lsplP25554IYCB0_YEAST HYPOTHETICAL 16.6 KD PROTEIN IN 
GBP2-PEL1 INTERGENIC REGION. gil83138lpirllS 19337 hypothetical protein 
YCLOlOc - yeast (Saccharomyces cerevisiae) gil5358lgnllPIDIe264452 (X59720) 
YCLOlOc, len: 146 [Saccharomyces cerevisiae] (Match DTV) 
>gil731426lsplP39941IYEI0_YEAST HYPOTHETICAL 56.5 KD PROTEIN IN 
HXT8 5'REGION. gill 0776 19lpirllS505 19 hypothetical protein YEL070w - yeast 
(Saccharomyces cerevisiae) gil603248 (U18795) Yel070p [Saccharomyces 
cerevisiae] gill302610lgnllPIDIe239852 (Z71688) ORF YNR073c [Saccharomyces 
cerevisiae] (Match DQV) 

>gill 174566lsplP41896IT2FB_YEAST TRANSCRIPTION INITIATION FACTOR 
IIF, BETA SUBUNIT (TFIIF-BETA) (TFIIF MEDIUM SUBUNIT) 
(TRANSCRIPTION FACTOR G 54 KD SUBUNIT). gill078424lpirllB55482 
transcription initiation factor IIF 54K chain - yeast (Saccharomyces cerevisiae) 
gil639703 (U13016) transcription initiation factor TFIIF middle subunit 
[Saccharomyces cerevisiae] (Match DW) 

>gil825501 (L42348)HOLl [Saccharomyces cerevisiae] (Match DGV) 
>gil258767lbbsll 17066 cytochrome c oxidase Via subunit homolog 
[Saccharomyces cerevisiae, JHRY1-2 alpha, Peptide Partial, 19 aa, segment 1 of 5] 
(Match DKV) 

>gil847740 (U 19781) beta-fructofuranosidase 2 precursor [Saccharomyces 
cerevisiae] (Match DTV) 

>gil914979 (U32445) P8283.8 gene product [Saccharomyces cerevisiae] (Match 
DRV) 

>gill353041lsplP46984IYJS4_YEAST HYPOTHETICAL 13.6 KD PROTEIN IN 
SWE1-ATP12 INTERGENIC REGION. gill077849lpirllS56967 hypothetical 
protein YJL184w - yeast (Saccharomyces cerevisiae) gill 008389 (Z49459) ORF 
YJL184w; pid:e201216 [Saccharomyces cerevisiae] (Match DAV) 
>gill352875lsplP47104IYJ03_YEAST HYPOTHETICAL 154.9 KD PROTEIN IN 
MER2-PET191 INTERGENIC REGION. gill077878lpirllS57052 hypothetical 
protein YJR033c - yeast (Saccharomyces cerevisiae) gill015679 (Z49533) ORF 
YJR033c; pid:e203690 [Saccharomyces cerevisiae] (Match DFV) 
>gilll29167 (X87297) J 1590 gene product [Saccharomyces cerevisiae] (Match 
DFV) 

>gil 1134890 (Z68290) Akrlp [Saccharomyces cerevisiae] gill 226040 (Z70202) 
Akrlp [Saccharomyces cerevisiae] (Match DMV) 
>gill302574lgnllPIDIe239841 (Z71670) ORF YNR055c [Saccharomyces 
cerevisiae] (Match DGV) 

>gill322879lgnllPIDIe243887 (Z72748) ORF YGL226w [Saccharomyces 
cerevisiae] (Match DLV) 

>gill322961lgnllPIDIe243366 (Z72790) ORF YGR005c [Saccharomyces 
cerevisiae] (Match DVV) 

>gill323286lgnllPIDIe243550 (Z72948) ORF YGR163w [Saccharomyces 
cerevisiae] (Match DDV) 

>gill420794lgnl!PIDIe252191 (Z75275) ORF YOR367w [Saccharomyces 
cerevisiae] (Match DIV) 



Other 



>gil401194lsplP31015ITNA2_SYMTH TRYPTOPHANASE 2 (L-TRYPTOPHAN 
INDOLE-LYASE 2). gil477858lpirllB49022 tryptophanase (EC 4.1.99.1) Tna2 - 
Symbiobacterium thermophilum gil216979 (D10013) tryptophanase 
[Symbiobacterium thermophilum] (Match DLV) 

>gill55612 (L09651) phosphoglycerate mutase [Zymomonas mobilis] (Match 
DLV) 

>gill361344lpirllD36891 transfer complex protein TrsC - Staphylococcus aureus 
gil310610 (LI 1998) putative [Staphylococcus aureus] gil405562 (LI 9570) putative 
[Plasmid pSK41] gil739958lprfll2004267D membrane protein traC [Staphylococcus 
sp.] (Match DDV) 

>gil6257 10lpirllC49695 4-methyl-5-(beta-hydroxyethyl)thiazole monophosphate 
synthesis protein ThiF - Escherichia coli gil414234 (M88701) thiF [Escherichia 
coli] (Match DPV) 

>gil97777lpirllA38729 pyruvate decarboxylase (EC 4.1.1.1) - Sarcina ventriculi 
(fragment) gil249565lbbsl 103674 pyruvate decarboxylase {EC 4.1.1.1 } [Sarcina 
ventriculi, strain JK, Peptide Partial, 36 aa] (Match DYV) 
>gil298240lbbsl 125733 DNA polymerase homolog [bacterium-like organism, citrus 
greening disease-associated, Peptide, 207 aa] (Match DLV) 
>gil477 1 73lpirll A48368 N5,N 1 0-methenyltetrahydromethanopterin cyclohydrolase 
- Archaeoglobus fulgidus (fragment) gil29988 1 Ibbsl 130469 N5,N10- 
methenyltetrahydromethanopterin cyclohydrolase {N-terminal} [Archaeoglobus 
fulgidus, VC-19. DSM 4304, Peptide Partial, 38 aa] (Match DGV) 
>gil406020 (U01764) unknown [Mycoplasma genitalium] (Match DSV) 
>gil414513 (U02113) homology to ribosomal protein LI Z11839 [Mycoplasma 
genitalium] (Match DW) 

>gil396331 (U00006) similar to E. coli ChlN [Escherichia coli] (Match DPV) 
>gil543897lsplP35804IBLIP_STRCL BETA-LACTAMASE INHIBITORY 
PROTEIN PRECURSOR (BLIP). gil98890lpirllA36710 beta-Lactamase inhibitory 
protein precursor - Streptomyces clavuligerus gill 53 192 (M34538) beta-lactamase 
inhibitory protein precursor [Streptomyces clavuligerus] (Match DLV) 
>gil538757lpirllA53488 heat shock cognate protein 66 - Escherichia coli gil454766 
(U05338) Hsc66 [Escherichia coli] (Match DEV) 

>gil461079lbbsl 142342 GroEL homolog {N-terminal} [Francisella tularensis, LVS, 
Peptide Partial, 18 aa] (Match DGV) 

>gil547685lsplP36541IHSCA_ECOLI HEAT SHOCK PROTEIN HSCA (HSC66). 
gill073308lpirllB36958 66K hsp70 homolog HscA - Escherichia coli gil402675 
(U01827) Hsp70 [Escherichia coli] (Match DEV) 
>gil 1 29002lsplP0706 1 IN YLB_FLASP 6-AMINOHEXANOATE-DIMER 
HYDROLASE (NYLON OLIGOMERS DEGRADING ENZYME EII). 
gil77553lpirllA29516 6-aminohexanoate-dimer hydrolase (EC 3.5.1.46) EII - 
Flavobacterium sp. KI72 plasmid pOAD2 gil43418 (X00046) EII enzyme 
[Flavobacterium sp.] gil488340 (D26094) 6-aminohexanoate-dimer hydrolase 
[Flavobacterium sp.] gil223803iprfll0912258A enzyme RSIIA,nylon degrading 
[Flavobacterium sp.] (Match DAV) 

>gil488342 (D26094) 6-aminohexanoate-dimer hydrolase [Flavobacterium sp.] 
(Match DAV) 

>gil507769 (U09675) RNA polymerase beta subunit [Liberobacter africanum] 
(Match DGV) 

>gilll891 llsplP10740IDPSD_ECOLI PHOSPHATIDYLSERINE 
DECARBOXYLASE PROENZYME. gil78759lpir!IA29234 phosphatidylserine 
decarboxylase (EC 4.1.1.65) precursor - Escherichia coli gil537004 (U 14003) 
phosphatidylserine decarboxylase [Escherichia coli] gil551827 (J03916) 
phosphatidylserine decarboxylase [Escherichia coli] (Match DQV) 
>gill361237lpirllS56466 phosphotransferase system trehalose permease - 
Escherichia coli gil537082 (U 14003) phosphotransferase system trehalose 
permease [Escherichia coli] (Match DIV) 

>gil479220lpirllS32798 merR protein - Xanthomonas sp. transposon Tn5053 
gil480554lpirllS37035 regulatory protein merR - Alcaligenes sp. 
gil480563lpirllS37044 regulatory protein merR - Pseudomonas fluorescens 
gill086170lpirllS51756 regulatory protein merR - Pseudomonas testosteroni 
gill54910 (L03729) putative [Transposon Tn5053] gil388554 (L20693) mer operon 
regulator [Alcaligenes sp.] gil393198 (L20694) mer operon regulator [Plasmid 
pMER05] gil397588 (Z23094) merR regulatory protein (repressor /inducer) 
[Alcaligenes sp.] gil397618 (Z23095) merR regulatory protein (repressor /inducer) 
[Pseudomonas fluorescens] gil483767 (X73112) mercury resistance DNA-binding 
protein [Pseudomonas fluorescens] gil607170 (Z33481) regulatory protein 
[Comamonas testosteroni] gil7 10575 (L40585) merR regulatory protein (repressor 
/inducer) [Transposon Tn5053] (Match DAV) 

>gil 142082 (L02520) ribulose 1,5-bisphosphate carboxylase/oxygenase large 
subunit [Anabaena sp.] gill 42086 (L02521) ribulose 1,5-bisphosphate 
carboxylase/oxygenase large subunit [Anabaena sp.] gill 42088 (L02522) ribulose 
1,5-bisphosphate carboxylase/oxygenase large subunit [Anabaena sp.] gill42105 
(JO 1540) ribulose- 1,5-bisphosphate carboxylase large subunit (rbcL) [Anabaena 
sp.] (Match DTV) 

>gill075610lpirllS52644 phycobilisome maturation protein - Synechococcus sp. 
gill42130 (M94218) phycobilisome maturation protein [Anacystis nidulans] 
gil446765 Iprfl 1191 229 1 J phycobilisome maturation protein [Synechococcus sp.] 
(Match DRV) 

>gil466182lsplP35151IYPUB_BACSU HYPOTHETICAL 7.2 KD PROTEIN IN 
PPIB-SIPS INTERGENIC REGION (ORFX1). gil6291 18lpirilS45538 hypothetical 
protein XI - Bacillus subtilis gil410120 (L09228) ORFX1 [Bacillus subtilis] 
(Match DRV) 

>gill42967 (M17642) succinate dehydrogenase [Bacillus subtilis] (Match DRV) 
>gill 18613lsplP08066IDHSB_BACSU SUCCINATE DEHYDROGENASE IRON- 
SULFUR PROTEIN. gill075923lpirllB27763 succinate dehydrogenase (EC 
1.3.99.1) iron-sulfur protein - Bacillus subtilis gill43527 (M13470) iron-sulfur 
protein [Bacillus subtilis] (Match DRV) 

>gil 144453 (M94320) very similar to DNA polymerase of Bacillus subtilis 
bacteriophage SP02; potential DNA polymerase; putative [Citrus greening disease- 
associated bacterium-like organism] (Match DLV) 
>gil78587lpirllG25035 hypothetical protein 2 - Escherichia coli plasmid Colla 
gil455439 (M 138 19) ORF2 [Plasmid Colla] (Match DDV) 
>gil78588lpirllH25035 hypothetical protein 2 - Escherichia coli plasmid Collb 
gil455441 (M13820) ORF2 [Plasmid Collb] (Match DDV) 
>gill45313 (K01304) L-ribulokinase (araB) [Escherichia coli] (Match DSV) 
>gill20350lsplP26608IFLIS_ECOLI FLAGELLAR PROTEIN FLIS. gill45989 
(M85240) flagellar protein [Escherichia coli] (Match DPV) 
>gi!125924lsplP26593ILACD_LACLA TAGATOSE 1,6-DIPHOSPHATE 
ALDOLASE. gil97943lpirllD39778 LacD tagatose-l,6-diphosphate aldolase - 
Lactococcus lactis gil 149396 (M65190) lacD [Lactococcus lactis] gil 149409 
(M60447) tagatose 1,6-diP aldolase [Lactococcus lactis] (Match DKV) 
>gil68525lpirllSYEXI isoleucine-tRNA ligase (EC 6.1.1.5) - Methanobacterium 
thermoautotrophicum gil 149728 (M59245) transfer RNA-Ile synthetase 
[Methanobacterium thermoautotrophicum] (Match DKV) 
>gill50352 (M84113) ORF1 [Transposon mini-Tn3Cm] (Match DAV) 
>gill21875lsplP24375IGVPK_HALHA GVPK PROTEIN. gil81055lpirllJQ1128 
GvpK protein - Halobacterium halobium plasmid pNRClOO gil43524 (X55648) 
gvpK gene product [Halobacterium halobium] gil455299 (M58557) gas vesicle 
protein [Plasmid pNRClOO] (Match DDV) 

>gill27013lsplP13111IMERR_SERMA MERCURIC RESISTANCE OPERON 
REGULATORY PROTEIN. gil96175lpirllA33858 merR protein - Escherichia coli 
plasmid pDU1358 gil455313 (M24940) mercury resistance protein [Plasmid 
pDU1358] (Match DAV) 

>gil 150838 (K02336) EII enzyme (6-aminohexanoic acid linear oligomer 

hydrolase) [Plasmid pOAD2] (Match DAV) 

>gil294462 (M28607) insB [Escherichia coli] (Match DKV) 

>gill21389lsplP13556IGLNB_RHOCA NITROGEN REGULATORY PROTEIN 

P-II. gil 15 1934 (M28244) glutamine synthetase glnB (EC 6.3.1.2) [Rhodobacter 

capsulatus] gil829596 (U25953) PII protein [Rhodobacter capsulatus] (Match 

DAV) 

>gill35828lsplP27477ITHTR_SYNP7 PUTATIVE THIOSULFATE 
SULFURTRANSFERASE PRECURSOR (RHODANESE-LIKE PROTEIN). 
gil28021 1 Ipirll A43669 rhodanese homolog rhdA precursor - Synechococcus sp. 
gil 154604 (M65244) rhdA [Synechococcus sp.] (Match DRV) 
>gil731176lsplP40981IXYLR_THER8 PUTATIVE XYLOSE REPRESSOR. 
gil632297lpirllS41787 xylR protein - Thermophilic bacterium gil31 1 188 (LI 8965) 
putative xylose repressor gene; putative [Thermophilic bacterial sp.] (Match DYV) 
>gill 175762lsplP46015IYDEB_ANASP HYPOTHETICAL PROTEIN IN DEVB 
5'REGION. gil556606 (U14553) ORF [Anabaena sp.] (Match DYV) 
>gill072948lpirllS51047 mauR protein - Paracoccus denitrificans gil558803 
(U12464) LysR-type transcriptional activator [Paracoccus denitrificans] (Match 
DAV) 
>gil629404lpirllS48833 cytochrome-c3 hydrogenase (EC 1.12.2.1) alpha chain - 
Pyrococcus furiosus gil563905 (X75255) hydrogenase (alpha subunit) [Pyrococcus 
furiosus] (Match DGV) 

>gill30794lsplP07781IPQQ2_ACICA COENZYME PQQ SYNTHESIS PROTEIN 
II. gil95318lpirllE32252 gene II protein - Acinetobacter calcoaceticus gil38744 
(X06452) gene II [Acinetobacter calcoaceticus] (Match DLV) 
>gill28258lsplP10996INIFE_CLOPA NITROGENASE IRON-MOLYBDENUM 
COFACTOR BIOSYNTHESIS PROTEIN NIFE. gil80505lpirllS04079 nitrogenase 
(EC 1.18.6.1) molybdenum-iron protein nifE - Clostridium pasteurianum gil40587 
(X13606) NifE protein (AA 1 - 456) [Clostridium pasteurianum] (Match DYV) 
>gil547614lsplP36553IHEM6_ECOLI COPROPORPHYRINOGEN III OXIDASE, 
AEROBIC (COPROPORPHYRINOGENASE) (COPROGEN OXIDASE), 
gil 1 073344lpirl IB 3 6964 coproporphyrinogen oxidase (EC 1.3.3.3), aerobic - 
Escherichia coli gil453969 (X75413) coproporphyrinogen oxidase [Escherichia 
coli] (Match DWV) 

>gil95681lpirllS06878 beta-Galactosidase (EC 3.2.1.23) - Escherichia coli 
(fragment) gil41904 (X16313) lacZ 5-region [Escherichia coli] (Match DGV) 
>gil78569lpirllS04774 hypothetical protein - Escherichia coli (fragment) gil42746 
(X15859) open reading frame (122 AA); pid:g42746 [Escherichia coli] (Match 
DQV) 

>gill29003lsplP07062INYLC_FLASP 6-AMINOHEXANOATE-DIMER 
HYDROLASE (NYLON OLIGOMERS DEGRADING ENZYME EII'). 
gil77554lpirllB22644 6-aminohexanoate-dimer hydrolase (EC 3.5.1.46) EII' - 
Flavobacterium sp. plasmid pOAD2 gil43420 (X02864) EII' (aa 1-392) 
[Flavobacterium sp.] gil223804lprfll0912258B enzyme RSIIB,nylon degrading 
[Flavobacterium sp.] (Match DAV) 

>gil79956 Ipirl I JH0207 hypothetical 10.8K protein - Enterococcus faecalis plasmid 
pAM-beta-1 gil45739 (XI 7092) ORFF (ttg start codon) [Enterococcus faecalis] 
(Match DFV) 

>gilll4867lsplP26177IBCHX_RHOCA CHLOROPHYLLIDE REDUCTASE 35.5 
KD CHAIN (CHLORIN REDUCTASE). gil79513lpir!IS17823 protochlorophyllide 
reductase (EC 1 .3. 1.33) 35.5K chain - Rhodobacter capsulatus gil4613 1 (Zl 1 165) 
333 aa (35.5 kD) chlorophillide reductase subunit, also known as chlorophyll Fe 
protein [Rhodobacter capsulatus] (Match DDV) 

>gilll6927lsplP24716ICOPR_STRAG PLASMID COPY CONTROL PROTEIN 
COPR. gil98007lpirilS22829 hypothetical protein - Streptococcus agalactiae 
gil581557 (X62150) 92 aa polypeptide [Streptococcus agalactiae] gil769739 
(X72021) circular [Streptococcus agalactiae] (Match DFV) 
>gill34993lsplP09398ISTRG_STRGR STREPTOMYCIN BIOSYNTHESIS 
PROTEIN STRG. gi!80801 ipirllS 17777 strG protein - Streptomyces griseus 
gil49266 (Y00459) strG [Streptomyces griseus] (Match DTV) 
>gil401018lsplP31814IRPOB_THECE DNA-DIRECTED RNA POLYMERASE 
SUBUNIT B. gil2803 54lpirl IS255 63 DNA-directed RNA polymerase (EC 2.7.7.6) 
chain B - Thermococcus celer gil48140 (X67313) Subunit B of DNA-dependent 
RNA polymerase [Thermococcus celer] (Match DRV) 

>gil625666lpirllA36925 LysR-type transcriptional activator CbbR - Xanthobacter 
flavus gil581832 (Z22705) DNA-binding protein [Xanthobacter flavus] (Match 
DPV) 

>gil515608 (Z35397) C. sativus 3-ketoacyl-CoA thiolase [Arabidopsis thaliana]
(Match DIV) 

>gil451328 (U02021) ecdysteroid receptor [Aedes aegypti] (Match DQV) 
>gil413919 (D21101) Guanyl Cyclase [Hemicentrotus pulcherrimus] (Match DDV)
>gil514269 (U07706) dihydropteroate synthetase [Plasmodium falciparum] (Match
DQV) 

>gil505159 (Z30659) dihydropteroate synthetase [Plasmodium falciparum] 
gil505169 (Z30665) dihydropteroate synthetase [Plasmodium falciparum] 
gil505171 (Z30655) dihydropteroate synthetase [Plasmodium falciparum] 
gil505175 (Z30657) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505161 (Z30660) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505163 (Z30653) dihydropteroate synthetase [Plasmodium falciparum] (Match
DQV) 
>gil505165 (Z30664) dihydropteroate synthetase [Plasmodium falciparum] (Match
DQV) 

>gil630466lpirllS47154 dihydropterin pyrophosphokinase/dihydropteroate
synthetase - Plasmodium falciparum gil505179 (Z31584) Dihydropterin 
pyrophosphokinase and Dihydropteroate synthetase [Plasmodium falciparum] 
(Match DQV) 

>gil505167 (Z30654) dihydropteroate synthetase [Plasmodium falciparum] 
gil505173 (Z30656) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505177 (Z30658) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil585279lsplQ08 169IHUGA_APIME HYALURONOGLUCOSAMINIDASE 
PRECURSOR (HYALURONIDASE) (ALLERGEN API M II) (API M 2). 
gil476996lpirllA47477 hyaluronidase - honeybee gil 155680 (L10710) hyaluronidase 
[Apis mellifera] (Match DQV) 

>gil 159276 (M64611) putative [Hydra vulgaris] (Match DVV) 

>gil552162 (L28823) reverse transcriptase [Phlebotomus perniciosus] (Match
DTV)

>gil 160301 (Ml 52 12) glycophorin binding protein [Plasmodium falciparum] 
(Match DEV) 

>gilll8063lsplP16065ICYGS_STRPU SPERACT RECEPTOR PRECURSOR 
(GUANYLATE CYCLASE). gil279588lpirllOYURCP speract receptor precursor - 
sea urchin (Strongylocentrotus purpuratus) gil 16 1477 (M22444) guanylate cyclase 
[Strongylocentrotus purpuratus] (Match DDV) 

>gil556182 (L36665) ORF; putative [Gonyaulax polyedra] (Match DLV) 
>gill63188 (L06320) alpha-interferon receptor [Bos taurus] (Match DSV) 
>gil246581lbbsl86109 zona pellucida-binding protein, AWN-1=C13' fragment 
[swine, sperm, Peptide Partial, 10 aa] (Match DXV) 

>gil399217lsplP30932ICD9_BOVINCD9 ANTIGEN. gil89462lpirllJX0221 CD9 
antigen - bovine gil 162821 (M81720) CD9 antigen [Bos taurus] (Match DMV) 
>gil562100 (U 15975) putative brain ryanodine receptor [Sus scrofa] (Match DQV) 
>gil462415lsplQ04790IINRl_BOVIN INTERFERON-ALPHA/BETA RECEPTOR 
ALPHA CHAIN PRECURSOR (IFN-ALPHA-REC). gil346520lpirllS27387 
interferon alpha receptor type 1 - bovine gil432 (X68443) interferon receptor type 1 
[Bos taurus] (Match DSV) 
>gill37049lsplP01145IUR1_CATCO UROTENSIN I. gil69066lpirllUOCClM
urotensin I - white sucker gil268092lgblI02277l Sequence 2 from Patent US 
4528189 gil270944lgblI0 17221 Sequence 2 from Patent US 4908352 (Match DEV) 
>gil268113lgblI02366l Sequence 1 from Patent US 4533654 gil268114lgblI02367l 
Sequence 2 from Patent US 4533654 (Match DEV) 
>gil268397lgblI03062l Sequence 4 from Patent US 4605642 (Match DEV) 
>gil268996lgblI00642l Sequence 8 from Patent US 4742157 (Match DXV) 
>gil270945lgblI0 17241 Sequence 3 from Patent US 4908352 (Match DEV) 
>gil227991lprfill714327A urotensin I [Hippoglossoides elassodon] 
gil270946lgblI01726l Sequence 4 from Patent US 4908352 (Match DEV) 
>gil592318lgblll 1 1631 Sequence 4 from Patent WO 8906658 (Match DSV) 
>gil5931 18lgbll 103471 Sequence 3 from Patent WO 8705938 (Match DTV) 
>gil594746lgblI04467l Sequence 7 from Patent EP 0162738 (Match DGV) 
>gill32051lsplP00875IRBL_SPIOL RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68133lpirllRKSPL ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - spinach chloroplast 
gil231312lpdbl8RUBIL Ribulose 1,5-Bisphosphate Carboxylase(Slash)oxygenase 
(E.C.4.1.1.39) Complex With Co2,Mg++ And 2-Carboxyarabinitol- 1,5- 
Bisphosphate gill 2291 (V00168) ribulose 1,5-bisphophate carboxylase [Spinacia 
oleracea] gil343375 (J01443) ribulose bisphosphate carboxylase large subunit 
[Spinacia oleracea] (Match DTV) 

>gillll564lpirllS09074 cytochrome P450-4b - rat (fragment) (Match DGV) 
>gil82261lpirllS06161 chitinase (EC 3.2.1.14) - potato (fragment) gil21465 
(X 141 33) endochintinase (315 AA) [Solanum tuberosum] (Match DTV) 
>gil84502lpirllB28563 hemoglobin chain IV - earthworm (Lumbricus terrestris) 
(fragment) (Match DDV) 

>gil84636lpirllS00492 hemocyanin chain la - Japanese spiny lobster (fragment) 
(Match DDV) 

>gil320206lpirllS28389 acyl carrier protein - Escherichia coli (fragment) (Match 
DTV) 

>gil281333lpirllPQ0397 nonstructural protein NS5 - hepatitis C virus (isolate E- 
bl2) (fragment) (Match DPV) 

>gil538860lpirllA61213 photoreaction center protein H - Rhodospirillum rubrum 
gil227675 Iprfll 1 709 1 58B puh gene [Rhodospirillum rubrum] (Match DRV) 
>gi!97994lpirllG35905 hypothetical protein 1 (Sm2) - Streptococcus mutans (Match 
DIV) 

>gil79995lpirllA28551 hypothetical protein 1 - Streptococcus mutans (strain GS-5) 
gill 196925 (M18954) unknown protein [Streptococcus mutans] (Match DIV) 
>gil483018lpirllB47607 immunogenic protein MPB70/MPB80 - Mycobacterium 
bovis (strain BCG) (fragment) (Match DPV) 
>gilll74853lsplP42743IUBCY_ARATH UBIQUITIN-CONJUGATING ENZYME
E2-18 KD (UBIQUITIN-PROTEIN LIGASE) (UBIQUITIN CARRIER 
PROTEIN) (PM42). gil481811lpirllS39483 ubiquitin-conjugating enzyme UBC2-1 - 
Arabidopsis thaliana gil22658 (X68306) ubiquitin-conjugating enzyme 
[Arabidopsis thaliana] (Match DKV) 

>gil22549 1 Iprfll 1 30430 1 B glycoprotein S8 [Brassica rapa] (Match DLV) 
>gill37055lspllURl_PLAFE_2 [Segment 2 of 2] UROTENSIN I PRECURSOR. 
gil280657lpirllA43978 urotensin I - European flounder gil2273 17iprfll 1701464A 
urotensin I [Platichthys flesus] (Match DEV) 

>gil87715lpirllPH0159 HLA-DRB sigma antigen DRB1-0701-Dwl7 - human 
(Match DTV) 

>gil87718lpirllPT0162 HLA-DRB sigma antigen DRB1-0901-Dw23 - human 
(Match DTV) 

>gil91588lpirllPT0641 T-cell receptor beta chain V-D-J region (120-2R) - mouse 
(fragment) (Match DWV) 

>gi!48 1 922lpirllS40 1 64 hemagglutinin-neuraminidase - Newcastle disease virus 
gil437889 (X71994) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gilll69937lsplP43519IGLNB_RHOSH NITROGEN REGULATORY PROTEIN 
P-II (PII SIGNAL TRANSDUCING PROTEIN). gil421339lpirllS33180 glnB 
protein - Rhodobacter sphaeroides gil809751 (X71659) glnB gene product 
[Rhodobacter sphaeroides] gill586928lprfll2205239A Glu synthetase [Rhodobacter 
sphaeroides] (Match DAV) 

>gil98843lpirllS 14091 40K protein - Saccharopolyspora erythraea (Match DAV) 
>gi 1479 1 79 Ipirl I S3243 8 pol polyprotein - Volvox carteri retrotransposon VCRT-I-1 
(fragment) gil938289 (X69621) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479 181 lpirllS32440 pol polyprotein - Volvox carteri retrotransposon VCRT-I-3 
(fragment) gil938291 (X69623) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479183lpirllS32442 pol polyprotein - Volvox carteri retrotransposon VCRT-I-6 
(fragment) gil938294 (X69626) reverse transcriptase [Volvox carteri] (Match 
DDV) 
>gil479184lpirllS32443 pol polyprotein - Volvox carteri retrotransposon VCRT-I-8
(fragment) (Match DDV) 

>gil479185lpirllS32444 pol polyprotein - Volvox carteri retrotransposon VCRT-II-1 
(fragment) gil938295 (X69629) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479188lpirllS32447 pol polyprotein - Volvox carteri retrotransposon VCRT-II-4 
(fragment) gil938298 (X69632) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479190lpirllS32449 pol polyprotein - Volvox carteri retrotransposon VCRT-II-3 
(fragment) gil938297 (X69631) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil477748lpirllB47759 reverse transcriptase (copia-like retrotransposon) - upland 
cotton (fragment) gill 673 17 (M94472) reverse transcriptase [Gossypium hirsutum] 
(Match DDV) 

>gil 10763 16lpirllS5 1478 Dil9 protein - Arabidopsis thaliana gil469110 (X78584) 
Dil9 [Arabidopsis thaliana] (Match DEV) 

>gil99777lpirl IS 14951 S-locus-specific glycoprotein SLG-8 - field mustard 
gil 17708 (X55274) S-locus glycoprotein [Brassica campestris] (Match DLV) 
>gil478421lpirllJQ2380 S-locus-specific glycoprotein precursor - rape 
gill076455lpirllS42280 S-locus glycoprotein - rape gill67170 (L08608) S-locus 
glycoprotein [Brassica napus] gil904227 (L10736) S-locus related glycoprotein 
[Brassica napus] (Match DLV) 

>gil99826lpirllS24546 S-locus glycoprotein - rape gill7868 (Z11725) S-locus 
glycoprotein [Brassica napus] (Match DLV) 

>gil434858 (X76472) pid:g434858 [Crucianella angustifolia] (Match DAV) 
>gil478565lpirllS 10849 alpha-amylase/trypsin inhibitor - durum wheat (Match 
DYV) 

>gil1172751lsplP41390IPUR1_SCHPO
AMIDOPHOSPHORIBOSYLTRANSFERASE (GLUTAMINE
PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE) (ATASE). 
gil481335lpirllS38482 amidophosphoribosyltransferase (EC 2.4.2.14) - fission yeast 
(Schizosaccharomyces pombe) gil629904lpirllS43526 PRPP amidotransferase (EC 
2.4.2.14) - yeast (Schizosaccharomyces pombe) gil410512 (X72293) PRPP 
amidotransferase [Schizosaccharomyces pombe] (Match DFV) 
>gil542640lpirll A488 1 0 fibrinogen B beta subunit - African clawed frog (fragment) 
gil450951 (U05035) fibrinogen B-beta subunit [Xenopus laevis] (Match DDV) 
>gil477549lpirll A49 192 transthyretin - bullfrog (fragment) gil299846lbbsl 130235 
transthyretin, T-T3BP=3,5,3'-L-triiodothyronine-specific binding protein {N- 
terminal) [bullfrogs, tadpole plasma, Peptide Partial, 19 aa] (Match DAV) 
>gil481489lpirllS38695 class II histocompatibility antigen betea chain - slender 
loris (fragment) (Match DTV) 
>gil478168lpirllE49164 chromogranin-B - rat (fragment) gil239365lbbsl66367
chromogranin-B, CgB=glucagonoma peptide [rats, Peptide Partial, 38 aa] (Match 
DNV) 

>gil543521lpirllB61222 cytochrome-c oxidase (EC 1.9.3.1) chain II - 
mitochondrion Steinernema intermedii (SGC4) (fragment) (Match DEV) 
>gil543672lpirllJQ2350 protein kinase (EC 2.7.1.37) - turkey herpesvirus gil406788 
(X68653) protein kinase homologue [Gallid herpesvirus 2] gil5838 1 1 ( A18267) 
ORF5 [Gallid herpesvirus 2] gill253294lpatlUSI5470734l5 Sequence 5 from patent 
US 5470734 (Match DSV) 

>gil478188lpirllF47758 reverse transcriptase (copia-like retrotransposon) - 
Liriodendron chinense (fragment) gill 68306 (M94477) reverse transcriptase 
[Liriodendron chinense] (Match DDV) 

>gilll6359lsplP23472ICHLY_HEVBR HEVAMINE A (CHITINASE /
LYSOZYME. gil82026lpirllS 17205 chitinase (EC 3.2.1.14) hevamine - Para rubber 
tree gil234388lbbsl52808 hevamine [Hevea brasiliensis, Peptide Partial, 273 aa] 
gill311006lpdbllHVQI Glycosidase, Chitin Degradation, Multifunctional Enzyme 
Mol_id: 1; Molecule: Hevamine A; Chain: Null; Ec: 3.2.1.14, 3.2.1.17; Heterogen: 
N-,N , -,N"-Triacetyl-Chitotriose; Other.details: Plant EndochitinaseLYSOZYME 
gill31 1007lpdbllHVMI Glycosidase, Chitin Degradation, Multifunctional Enzyme 
Mol_id: 1; Molecule: Hevamine A; Chain: Null; Ec: 3.2.1.14, 3.2.1.17; 
Other_details: Plant EndochitinaseLYSOZYME gill421554lpdbllLLOI Hevamine 
A (A Plant EndochitinaseLYSOZYME) COMPLEXED WITH Allosamidin 
Chitinase, Lysozyme Mol_id: 1; Molecule: Hevamine; Chain: Null; Synonym: 
ChitinaseLYSOZYME; Ec: Ec 3.2.1.14, 3.2.1.17; Heterogen: Allosamidin (Match 
DSV) 

>gil467822 (U02606) chitinase [Solanum tuberosum] (Match DTV) 
>gil6297 1 7 lpirl!S43 317 chitinase (EC 3.2.1.14) - potato (fragment) gil467824 
(U02607) chitinase [Solanum tuberosum] (Match DTV) 
>gil467911 (U03086) ribulose-1,5-bisphosphate carboxylase/oxygenase large
subunit [Sarcothalia decipiens] (Match DVV)

>gil514215 (U02963) dynein beta heavy chain [Chlamydomonas reinhardtii] 
(Match DVV)
>gil516552 (U10078) cyclin IbZm [Zea mays] (Match DLV)
>gilll70247lsplP43082IHEVL_ARATH HEVEIN-LIKE PROTEIN PRECURSOR. 
gil407248 (U01880) pre-hevein-like protein [Arabidopsis thaliana] (Match DRV) 
>gil625982lpirllJC2250 S-locus-specific glycoprotein S12 precursor - field mustard 
gil547238lbbsl 149323 (S70937) S-glycoprotein [Brassica campestris, S12S12 
homozygotes, stigmas. Peptide, 436 aa] gil743639lprfll2013216A S glycoprotein 
[Brassica rapa] (Match DLV) 

>gil289868 (L12640) ribulose 1,5-bisphosphate carboxylase large subunit 
[Chloranthus japonicus] (Match DTV) 

>gil460648 (L29492) ribulose 1,5 bisphosphate carboxylase [Comesperma 
ericinum] (Match DTV) 

>gil290939 (LI 2649) ribulose 1,5-bisphosphate carboxylase large subunit 
[Hedyosmum arborescens] (Match DTV) 

>gil3 10368 (L19972) ribulose 1,5-bisphosphate carboxylase [Stegolepis allenii] 
(Match DKV) 

>gil484236 (L05041) ribulose 1,5-bisphosphate carboxylase large subunit 
[Tradescantia sp.] (Match DKV) 

>gil 166459 (L06946) beta-tubulin [Acremonium uncinatum] (Match DAV) 
>gill66467 (L06954) beta-tubulin [Acremonium sp.] gil 168130 (L06959) beta- 
tubulin [Epichloe amarillans] (Match DAV) 

>gilll9975lsplP16972IFER_ARATH FERREDOXIN PRECURSOR. 
gil99692lpirllS09979 ferredoxin [2Fe-2S] precursor - Arabidopsis thaliana gill6437 
(X51370) ferredoxin precursor [Arabidopsis thaliana] gill66698 (M35868) 
ferrodoxin A [Arabidopsis thaliana] (Match DIV) 
>gill67172 (M36301) S-6-glycoprotein [Brassica campestris] 
gil225490lprflll304301A glycoprotein S6 [Brassica rapa] (Match DLV) 
>gil 166461 (L06951) beta-tubulin [Acremonium coenophialum] gil 166463 
(L06952) beta-tubulin [Acremonium sp.] gil 166469 (L06963) beta-tubulin 
[Acremonium sp.] gil 166471 (L06964) beta-tubulin [Acremonium coenophialum] 
gill68122 (L06955) beta-tubulin [Epichloe festucae] gil 168 124 (L06956) beta- 
tubulin [Epichloe festucae] gill68126 (L06957) beta-tubulin [Epichloe festucae] 
gill68128 (L06958) beta-tubulin [Epichloe amarillans] gill68133 (L06961) beta- 
tubulin [Epichloe amarillans] gil 168 135 (L06962) beta-tubulin [Epichloe sp.] 
(Match DAV) 

>gill69359 (J01262) phaseolin [Phaseolus vulgaris] gil897800 (V01163) phaseolin 
[Phaseolus vulgaris] (Match DDV) 

>gil457400 (D21840) MAP kinase [Arabidopsis thaliana] (Match DSV) 
>gil3 10372 (L13485) ribulosebisphosphate carboxylase [Sphagnum palustre] 
(Match DTV) 

>gil309636 (LI 1058) 'Ribulosebiphosphate Carboxylase' [Ophioglossum 
engelmanii] (Match DTV) 

>gil3815 (X00788) 1G2 protein [Schizophyllum sp.] (Match DPV) 
>gil547991lsplP36606INAH_SCHPO PROBABLE NA(+)/H(+) ANTIPORTER.
gil82816lpirllS20951 Na+/H+ antiporter - fission yeast (Schizosaccharomyces 
pombe) gil5090 (Z11736) putative sodium/proton antiporter [Schizosaccharomyces 
pombe] (Match DYV) 

>gill34531lsplP22553ISLS2_BRAOA S-LOCUS-SPECIFIC GLYCOPROTEIN 
BS29-2 PRECURSOR. gill7889 (X16123) S locus specific glycoprotein [Brassica 
oleracea] (Match DLV) 

>gil 17894 (X55275) S-locus glycoprotein [Brassica oleracea] (Match DLV) 
>gill34534lsplP07761ISLS6_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN S6 
PRECURSOR (SLSG-6). gil81703lpirllA27827 S-locus-specific glycoprotein S6 
precursor - wild cabbage gill 7901 (Y00268) SLSG (AA -31 to 405) [Brassica 
oleracea] gil225542lprflll305350A protein,S locus allele [Brassica oleracea var. 
botrytis] (Match DLV) 

>gil436130 (X76634) ribulose-l,5-bisphosphate carboxylase [Physcomitrella 
patens] (Match DTV) 

>gill346963lsplP20455IRBL_ATRRS RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil99516lpirllF34921 ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain - Atriplex rosea chloroplast 
gilll323 (X55831) rubisco large subunit [Atriplex rosea] (Match DTV) 
>gill31998lsplP19163IRBL_NEUMU RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68147lpirllRKNULM 
ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - Neurachne 
munroi chloroplast gill00640lpirlIH34921 ribulose-bisphosphate carboxylase (EC 
4.1.1.39) large chain - Neurachne munroi chloroplast gill 1751 (X55828) rubisco 
large subunit [Neurachne munroi] (Match DKV) 
>gill31999lsplP19164IRBL_NEUTE RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68146lpirllRKNULT 
ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - Neurachne 
tenuifolia chloroplast gi i 1 0064 1 Ipirl IG3492 1 ribulose-bisphosphate carboxylase (EC 
4.1.1.39) large chain - Neurachne tenuifolia chloroplast gill 1798 (X55827) rubisco 
large subunit [Neurachne tennifolia] (Match DKV) 
>gil299258lbbsl 127093 (S56181) pyruvate dehydrogenase alpha subunit {C-
terminal} {EC 1.2.4.1} [human, Peptide Partial Mutant, 14 aa] (Match DQV)
>gil385595lbbsl 133340 (S62078) platelet-derived growth factor A-chain, PDGF A- 
chain {N-terminal} [human, Peptide Partial, 53 aa] (Match DSV) 
>gill24884lsplP16808IIR10_HCMVA HYPOTHETICAL PROTEIN IRL10 
PRECURSOR (TRL10). gil76487lpirllS09903 hypothetical protein IRL10 precursor 
- human cytomegalovirus (strain AD169) gil833108 (X17403) HCMVIRL10 = 
TRL10 (AA 1-171) [Human cytomegalovirus] (Match DNV) 
>gill34532lsplP17840ISLS3_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN
S13 PRECURSOR (SLSG-13). gil81698lpirllB27827 S-locus-specific glycoprotein
S13 precursor - wild cabbage (fragment) (Match DLV)

>gill34533lsplP17841ISLS4_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN
S14 PRECURSOR (SLSG-14). gil81699lpirllC27827 S-locus-specific glycoprotein
S14 precursor - wild cabbage (fragment) (Match DIV)

>gil267240lsplP30088IUPA2_HUMAN UNKNOWN PROTEIN FROM 2D-PAGE 
OF PLASMA (SPOT 10). (Match DQV) 

>gill 17097lsplP00426ICOXA_BOVIN CYTOCHROME C OXIDASE 
POLYPEPTIDE VA. gil66277lpirllCABO cytochrome-c oxidase (EC 1.9.3.1) chain 
Va - bovine gil229632lprfll77 1 727 A oxidase heme a,cytochrome [Bos taurus] 
(Match DKV) 

>gill26902lsplP80040IMDH_CHLAU MALATE DEHYDROGENASE. (Match 
DIV) 

>gill26903lsplP80039IMDH_CHLTE MALATE DEHYDROGENASE. (Match 
DVV) 

>gill26906lsplP80037IMDH_HELGE MALATE DEHYDROGENASE. (Match 
DIV) 

>gill31906lsplP00879IRBL_ANASP RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68158ipirllRKAIL7 ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain - Anabaena sp. 
gil223640lprfll0904327A carboxylase,RBP [Anabaena sp.] (Match DTV) 
>gil417995lsplP30138ITHIF_ECOLI THIF PROTEIN. (Match DPV) 
>gill36991lsplP16787IUL96_HCMVA HYPOTHETICAL PROTEIN UL96. 
gil76602lpirllS09861 hypothetical protein UL96 - human cytomegalovirus (strain 
AD169) gil833080 (X17403) HCMVUL96 (AA 1-115) [Human cytomegalovirus] 
(Match DAV) 

>gill37504lsplP21075IVB17_VACCC PROTEIN B17. gil93309lpirllG42527 B17L 
protein - vaccinia virus (strain Copenhagen) gil335564 (M35027) B17L; putative 
[Vaccinia virus] (Match DNV) 

>gil267281lsplQ01221IVB17_VACCV PROTEIN B17. gil3 2 1 39 1 Ipirll JQ 1810 
B16L protein - vaccinia virus (strain WR) gil222761 (Dl 1079) 39.5K protein 
[Vaccinia virus] (Match DNV) 
>gill38210lsplP18538IVGLB_HSVMD GLYCOPROTEIN B PRECURSOR.
gil73946lpirllVGBERB glycoprotein B precursor - Marek's disease virus (strain 
RB1B) gil221837 (D13713) glycoprotein B precursor [Gallid herpesvirus type 1] 
gill 100890 (U39846) glycoprotein B [Gallid herpesvirus 2] (Match DAV) 
>gil547619lsplP12554IHEMA_NDVA HEMAGGLUTININ-NEURAMINIDASE. 
gil67467lpirllHNNZAV hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain Australia- Victoria virulent) (Match DGV) 
>gil547620lsplP35740IHEMA_NDVC HEMAGGLUTININ-NEURAMINIDASE. 
gil419457lpirllC36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain CHI/85) gil332352 (M24716) hemagglutinin-neuraminidase 
[Newcastle disease virus] (Match DRV) 

>gil547621lsplP35741IHEMA_NDVH3 HEMAGGLUTININ-NEURAMINIDASE. 
gil419459lpirllA36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain HER/33) (Match DGV) 

>gil 122996lsplP12556IHEMA_NDVI HEMAGGLUTININ-NEURAMINIDASE. 
gil77139lpirllS07126 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle disease 
virus (strain Italien) gil332362 (Ml 8640) hemagglutinin-neuraminidase [Newcastle 
disease virus] gil226 1 5 8lprfll 14131 94 A hemagglutinin neuraminidase [Newcastle 
disease virus] (Match DGV) 

>gil547622lsplP35742IHEMA_NDVJ HEMAGGLUTININ-NEURAMINIDASE. 
gil419460lpirllD36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain IB A/85) gi!332354 (M24717) hemagglutinin-neuraminidase 
[Newcastle disease virus] (Match DRV) 

>gill22997lsplP12557IHEMA_NDVM HEMAGGLUTININ-NEURAMINIDASE. 
gil332368 (Ml 9479) hemagglutinin-neuraminidase glycoprotein [Newcastle 
disease virus] (Match DKV) 

>gill35128lsplP26499ISYI_METTH ISOLEUCYL-TRNA SYNTHETASE 
(ISOLEUCINE--TRNA LIGASE) (ILERS). (Match DKV) 
>gil401222lsplP31779ITTHY_RANCA TRANSTHYRETIN (PREALBUMIN) 
(TADPOLE T3-BINDING PROTEIN) (T-T3BP). (Match DAV) 
>gil465058lsplP33878IVB17_VARV PROTEIN B17. gil419242lpirllI36856 B18L
protein - variola virus (strain India-1967) gil628217lpirllS46875 gene B17L protein 
(COP) - variola virus gil439093 (L22579) homolog of vaccinia virus CDS B17L; 
putative [variola major virus] gil457077 (X69198) pid:g457077 [Variola virus] 
gil516436 (X67117) B17L COP gene product [Variola virus] gil885783 (U18339) 
D6L [Variola virus] gil885845 (U18341) B15L [Variola virus] gill 150675 
(X72086) ORF17L; B18L in citation [3] [Variola virus] gil745309lprfll2015436HK 
B18L gene [Variola major virus] (Match DNV) 

>gill38975lsplP15775IVSMP_CVBF PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.7 KD 
PROTEIN). gil74875lpirllMNIHB3 nonstructural protein NS3 - bovine coronavirus 
(strain F15) gil58686 (X51347) NS3 protein (AA 1-84) [Bovine coronavirus] 
(Match DDV) 

>gill38976lsplP15779IVSMP_CVBM PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.7 KD 
PROTEIN). gil418984lpirllD46346 nonstructural protein NS3 - bovine coronavirus 
(strain Mebus) gil323368 (M31054) nonstructural 9.7 kDa protein (put.); putative 
[Bovine coronavirus] (Match DDV) 

>gi!465439lsplQ04854IVSMP_CVHOC PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.5 KD 
PROTEIN). gil47639 1 Ipirl IB44275 nonstructural protein NS3 - human coronavirus 
(strain OC43) gil329569 (M99576) 9.5 kDa nonstructural protein [Human 
coronavirus] (Match DDV) 

>gil549520lsplP36566IYCBD_ECOLI HYPOTHETICAL 29.8 KD PROTEIN IN 
KDSB-KICB INTERGENIC REGION. gill261828 (D26440) S- 
adenosylmethionine-dependent methltransferase [Escherichia coli] 
gill585880lprfll220221 1A Met(S-adenosyl)-dependent methyltransferase 
[Escherichia coli] (Match DKV) 

>gil465867lsplP34403IYLU9_CAEEL HYPOTHETICAL 14.8 KD PROTEIN 
F10E9.9 IN CHROMOSOME III. (Match DTV) 

>gilll9932lsplP00229IFERl_PHYAM FERREDOXIN I. gil65749lpirllFEFWl 
ferredoxin [2Fe-2S] I - Virginian pokeweed (Match DIV) 

>gilll9959lsplP14938IFER3_RAPSA FERREDOXIN, LEAF L-A. (Match DMV) 
>gill30608lsplP05960IPOL_HVlC4 POL POLYPROTEIN (PROTEASE 
(RETROPEPSIN) ; REVERSE TRANSCRIPTASE ; RIBONUCLEASE H. (Match 
DEV) 

>gill31765lsplP21760IQSP_CHICK QUIESCENCE-SPECIFIC PROTEIN 
PRECURSOR (P20K) (CH21 PROTEIN). gil86417lpirllA30230 quiescence- 
specific protein precursor - chicken (Match DEV) 

>gil208939 (M14181) preproparathyroid hormone [Artificial gene] gil209049 
(M14182) preproparathyroid hormone [Artificial gene] gil209052 (M14183) 
preproparathyroid hormone [Artificial gene] (Match DMV) 
>gil209048 (M14182) synthetic preproparathyroid hormone [Artificial gene]
gil209051 (M14183) synthetic preproparathyroid hormone [Artificial gene] (Match
DMV)

>gil344735 (A04054) MDV gB gene product [unidentified] gil412763 (A06147) gB
gene product [unidentified] (Match DAV)

>gil221553 (D10134) NS-5 [Hepatitis C virus] (Match DPV) 

>gil234099lbbsl52140 NS3 protein [bovine enteritic coronavirus BECV, strain F15,
Peptide, 84 aa] (Match DDV)

>gil256415lbbsl 114657 VP3=major structural polypeptide {N-terminal} [infectious 
flacherie virus IFV, silkworm Bombyx mori, Peptide Partial, 15 aa] (Match DIV) 
>gil454753 (U04469) polymerase [Desert Shield virus] (Match DGV) 
>gill364135lpirllE49600 probable aphid transmission factor - soybean dwarf virus
gil436022 (L24049) coat protein [Soybean dwarf virus] (Match DLV) 
>gil471720 (U01886) gB homolog [Gallid herpesvirus 2] (Match DAV) 
>gil323678 (M60583) ORF 1; putative [Densovirus of Bombyx type 1] (Match 
DYV) 

>gil305785 (L19242) glycoprotein 120 [Human immunodeficiency virus type 1] 
(Match DPV) 

>gil385141 (L23451) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gil385143 (L23452) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gil385149 (L23455) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gil332344 (M24712) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332346 (M24713) hemagglutinin- neuraminidase [Newcastle disease virus] 
(Match DKV) 

>gil332348 (M24714) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332350 (M24715) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332360 (M22110) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil457315 (L23828) RNA polymerase [Norwalk virus] (Match DGV) 
>gil295510 (L07937) 37 kDa protein [Soil-borne wheat mosaic virus] (Match 
DSV) 



FIG. 8M-2 




>gil433113 (U03762) multigene family 360 protein [African swine fever virus]
(Match DTV) 

>gil76494lpirllS09759 hypothetical protein TRL10 precursor - human
cytomegalovirus (strain AD169) gil59601 (X17403) HCMVTRL10 = IRL10 (AA
1-171) [Human cytomegalovirus] (Match DNV)
>gil211503 (M55644) marker protein [Gallus gallus] (Match DEV)
>gil576796 (M25784) quiescence-specific protein [Gallus gallus] (Match DEV) 
>gil509165 (X70945) cellular retinoic acid binding protein I [Ambystoma 
mexicanum] (Match DDV) 

>gil227060lprflll613430A rimK assocd ORF [Escherichia coli] (Match DQV) 
>gil76336lpirllCOSJS 1G2 protein - bracket fungus (Schizophyllum commune) 
gil224150lprfll1011193A 1G2 gene ORF [Schizophyllum commune] (Match DPV)
>gill346547lsplP48040IMLlA_SHEEP MELATONIN RECEPTOR TYPE 1A 
(MEL-1A-R). gil602132 (U14109) Mel-la melatonin receptor [Ovis aries] (Match 
DSV) 

>gil625362lpirllA61338 flavodoxin - Clostridium pasteurianum (fragment) (Match
DVV)

>gil625983lpirllJC2251 S-locus-specific glycoprotein S8 precursor - field mustard
gill304011 (D84468) SLG8 [Brassica campestris] (Match DLV) 
>gil628958lpirllS45092 cops protein - Streptococcus pyogenes gill333835 
(X66468) copS gene product [Streptococcus pyogenes] (Match DFV) 
>gil629545lpirllS40470 protein kinase 4, mitogen-activated - Arabidopsis thaliana 
(Match DSV) 

>gil 1 07646 llpirllS5 1139 S locus glycoprotein - wild cabbage gil624941 (X79431) S 
locus glycoprotein [Brassica oleracea] (Match DLV) 

>gil60 181 2lbbsl 1 5 1 834 (S7201 1) P14=small low-abundant nonstructural protein 
[bacteriophage, phi 6, Peptide. 62 aa] (Match DGV) 

>gil632906lbbsl 152232 RNA polymerase [human enteric calicivirus HCV, Peptide 
Partial, 54 aa] (Match DGV) 

>gil676884 (D29681) The expression is induced by Pi starvation. [Nicotiana 
tabacum] gil 10948 19lprfll2106387C Al-induced protein [Nicotiana tabacum] 
(Match DRV) 

>gil729540lsplP80348IFUC2_RAT FUCTININ 2 (FUCOSYLTRANSFERASE 
INHIBITOR 2). gil639583lbbsl 155067 fuctinin peptide 2=fucosyltransferase 
inhibitor {N-terminal} [rats, small intestinal mucosa, Peptide Partial, 22 aa] (Match 
DEV) 

>gil755077 (L34630) membrane protein [Synechocystis sp.] gill653000 (D90910) 
Mn transporter MntB [Synechocystis sp.] (Match DQV) 
>gil765093 (D50053) ORF5 [Orgyia pseudotsugata nuclear polyhidrosis virus] 
gill584397lprfll2122421B ORF 5 [Orgyia pseudotsugata nuclear polyhidrosis 
virus] (Match DKV) 
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>gil765256lbbsl 156682 (S73813) lymphoid cell activation antigen, 
CD39=guanosine diphosphatase homolog [human, B lymphoblastoid cell line, MP- 
1, Peptide, 510 aa] (Match DMV) 

>gi|1083916|pir||JC2572 hypothetical 18K protein - Leuconostoc oenos phage L10
gi|806612 (L13035) ORFA [Bacteriophage L10] (Match DDV)

>gi|808689 (M19004) unknown protein [Saimirine herpesvirus 1] (Match DWV)

>gi|261755|bbs|122153 aconitase, iron-responsive element binding protein, IRE-BP
{EC 4.2.1.3} [cattle, liver cytosol, Peptide Partial, 11 aa, segment 4 of 6] (Match
DVV)

>gi|544869|bbs|142782 beta-glucosidase [Hordeum vulgare=barley, Sofia, Peptide
Partial, 15 aa, segment 2 of 6] (Match DGV)

>gi|400168|sp||LCAT_PIG_10 [Segment 10 of 11] PHOSPHATIDYLCHOLINE-
STEROL ACYLTRANSFERASE PRECURSOR (LECITHIN-CHOLESTEROL
ACYLTRANSFERASE) (PHOSPHOLIPID-CHOLESTEROL
ACYLTRANSFERASE). (Match DPV)

>gi|400776|sp||PHLD_HUMAN_6 [Segment 6 of 8] PHOSPHATIDYLINOSITOL-
GLYCAN-SPECIFIC PHOSPHOLIPASE D (PI-G PLD) (GLYCOPROTEIN
PHOSPHOLIPASE D). (Match DXV)

>gi|860940 (X78951) core protein [Hepatitis C virus] (Match DGV)
>gi|881414 (U27512) trichocyst matrix protein T4 [Paramecium tetraurelia]
gi|881416 (U27513) trichocyst matrix protein T4 [Paramecium tetraurelia] (Match
DKV)

>gi|881418 (U27514) trichocyst matrix protein T4 [Paramecium tetraurelia] (Match
DKV)

>gi|1361418|pir||S57659 argininosuccinate synthase (EC 6.3.4.5) - Streptomyces
clavuligerus gi|886906 (Z49111) argininosuccinate synthetase [Streptomyces
clavuligerus] gi|1586511|prf||2204224A argininosuccinate synthetase
[Streptomyces clavuligerus] (Match DLV)

>gi|899227 (X03170) SLSG (COOH end); pid:e188274 [Brassica oleracea] (Match
DLV)

>gi|913953|bbs|164394 threonine dehydrogenase, TDH {N-terminal} {EC
1.1.1.103} [Clostridium sticklandii, DSM 519T, ATCC 12662, Peptide Partial, 30
aa] (Match DNV)

>gi|65750|pir||FEFWF ferredoxin [2Fe-2S] I - food pokeweed (Match DIV)
>gi|419454|pir||H46328 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain AUS/32) (Match DGV)

>gi|419463|pir||I46328 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain MIY/51) (Match DKV)

>gi|419461|pir||B36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain ITA/45) (Match DGV)

>gi|81707|pir||JX0082 ferredoxin [2Fe-2S] A, leaf - radish (Match DVV)
>gi|81752|pir||S06631 lectin - coral tree (Match DAV)

>gi|541283|pir||B49850 chlorin reductase subunit BchX - Rhodobacter capsulatus
(Match DDV)

>gi|99563|pir||A28965 ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain
- spinach (fragments) (Match DTV)

>gi|81701|pir||S04906 S-locus-specific glycoprotein S29-2 precursor - wild cabbage
(fragment) (Match DLV)

>gi|89156|pir||A05311 apolipoprotein A-I - pig (fragment) (Match DRV)
>gi|89263|pir||B29544 phosphatidylcholine-sterol O-acyltransferase (EC 2.3.1.43)
peptide B - pig (fragment) (Match DPV)

>gi|911252|pat|US|5411941|10 Sequence 10 from Patent US 5411941
gi|1608026|pat|US|5508263|10 Sequence 10 from patent US 5508263 (Match
DMV)

>gi|911989|pat|US|5422424|1 Sequence 1 from patent US 5422424 (Match DLV)
>gi|106794|pir||S17112 interferon alpha/beta receptor - human (Match DFV)
>gi|1175245|sp|P43996|Y421_HAEIN HYPOTHETICAL PROTEIN HI0421.
gi|1074400|pir||E64007 hypothetical protein HI0421 - Haemophilus influenzae
(strain Rd KW20) gi|1573398 (U32725) H. influenzae predicted coding region
HI0421 [Haemophilus influenzae] (Match DKV)

>gi|1176329|sp|P44812|YIIU_HAEIN HYPOTHETICAL PROTEIN HI0668.
gi|1074476|pir||D64156 hypothetical protein HI0668 - Haemophilus influenzae
(strain Rd KW20) gi|1573669 (U32750) hypothetical [Haemophilus influenzae]
(Match DNV)

>gi|927494 (X89861) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|927497 (X89863) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|927500 (X89862) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|947124|bbs|163644 ferredoxin component al [Raphanus sativus var.
longipinnatus=Chinese radish, leaves, seedlings, Peptide, 96 aa] (Match DVV)
>gi|1363938|pir||S53870 metalloproteinase-3 tissue inhibitor - human
gi|957310|bbs|165606 hTIMP-3=tissue inhibitor of metalloproteinase-3 {N-
terminal} [human, Peptide Partial, 18 aa] (Match DIV)
>gi|971666 (F14634) rho protein dissociation inhibitor homolog [Sus scrofa]
(Match DIV)

>gi|995573 (U03772) putative transposase [Acinetobacter calcoaceticus] (Match
DTV)

>gi|995574 (U03772) ORF2 gene product [Acinetobacter calcoaceticus] (Match
DTV)

>gi|998292 (U33482) ependymin [Gasteropelecus sp.] (Match DGV)
>gi|998306 (U33487) ependymin [Nannobrycon sp.] (Match DGV)
>gi|1346543|sp|P49285|ML1A_CHICK MELATONIN RECEPTOR TYPE 1A
(MEL-1A-R). gi|1000104 (U31820) Mel-1a melatonin receptor [Gallus gallus]
(Match DSV)

>gi|1001110 (D64001) hypothetical protein [Synechocystis sp.] (Match DSV)
>gi|1001172 (D64001) hypothetical protein [Synechocystis sp.] (Match DGV)
>gi|1001295 (D64006) hypothetical protein [Synechocystis sp.] (Match DPV)
>gi|1016694 (U33011) urease accessory protein G [Mycobacterium tuberculosis]
gi|1583729|prf||2121356E urease:SUBUNIT=G [Mycobacterium tuberculosis]
(Match DGV)

>gi|1042011|bbs|169021 (S78693) cyclic AMP response element-binding protein-1
alpha isoform=alpha CREB-1 {alternatively spliced, internal fragment} [human,
placenta, Peptide Partial, 21 aa] (Match DSV)

>gi|1050760 (X83665) ribulose-1,5-bisphosphate carboxylase [Rogiera
suffrutenscens] (Match DPV)

>gi|1051157 (X91985) glycoprotein 100 [Marek disease virus type 1] (Match
DAV)

>gi|1052601 (X82442) pid:e122803 [Gallus gallus] (Match DGV)
>gi|1061312 (M87661) nonstructural polyprotein [Norwalk calicivirus] (Match
DGV)

>gi|1351660|sp|Q09907|YAJ7_SCHPO HYPOTHETICAL 40.2 KD PROTEIN
C30D11.07 IN CHROMOSOME I. gi|1065894 (Z67961) unknown
[Schizosaccharomyces pombe] (Match DLV)

>gi|1353146|sp|Q09637|YR11_CAEEL PROBABLE PEPTIDYL-PROLYL CIS-
TRANS ISOMERASE T27D1.1 (PPIASE) (ROTAMASE). (Match DLV)
>gi|1071799|pir||PA0003 nucleoside-diphosphate kinase (EC 2.7.4.6) - Arabidopsis
thaliana (fragment) (Match DGV)

>gi|1083351|pir||PC2239 heat shock protein, high-molecular-mass 105B - mouse
(fragments) (Match DMV)

>gi|1083905|pir||A55209 H transfer determinant A - plasmid R478 gi|1326033
(L20341) IncHI2 transfer repressor [Plasmid R478] (Match DEV)

>gi|1100235 (L48985) resolvase [Pseudomonas syringae] (Match DKV)

>gi|1122533 (U39944) BELL1 [Arabidopsis thaliana] (Match DIV)

>gi|1176915|sp|P42955|YSLB_BACSU HYPOTHETICAL 17.3 KD PROTEIN IN
LYSC 3'REGION. gi|1129090 (J03294) ORF; putative [Bacillus subtilis] (Match
DPV)

>gi|1139612 (U43400) structural phosphoprotein [Human herpesvirus 7] (Match
DVV)

>gi|1146150 (L43365) fiber protein [Human adenovirus type 2] (Match DGV)
>gi|1150923 (X94677) major DNA binding protein [Bovine herpesvirus 1]
gi|1491628|gnl|PID|e258523 (Z78205) UL29 [Bovine herpesvirus 1] (Match DMV)
>gi|1160339 (U21000) MerR [Pseudomonas stutzeri] gi|1586135|prf||2203290A
merR gene [Pseudomonas stutzeri] (Match DAV)

>gi|1163120 (U43537) ORF1; putative ABC excision nuclease repair protein
[Streptomyces argillaceus] (Match DAV)

>gi|1164905 (X83637) ribulose-1,5-bisphosphate carboxylase, large subunit
[Gardenia thunbergia] (Match DKV)

>gi|1171462|bbs|171023 SnaA=pristinamycin IIA synthase 50 kda subunit {N-
terminal, internal fragment} [Streptomyces pristinaespiralis, SP92, ATCC 25486,
Peptide Partial, 20 aa, segment 2 of 2] (Match DFV)

>gi|1173549 (U31208) NADH dehydrogenase type 1 subunit [Anabaena sp.]
(Match DWV)

>gi|1181520 (U42580) A357L [Paramecium bursaria Chlorella virus 1] (Match
DFV)

>gi|1172748|sp|P36672|PTTB_ECOLI PTS SYSTEM, TREHALOSE-SPECIFIC
IIBC COMPONENT (EIIBC-TRE) (TREHALOSE-PERMEASE IIBC
COMPONENT) (PHOSPHOTRANSFERASE ENZYME II, BC COMPONENT)
(EII-TRE). (Match DIV)

>gi|1204170 (Z69729) unknown [Schizosaccharomyces pombe] (Match DNV)
>gi|1213262 (Z69795) unknown [Schizosaccharomyces pombe] (Match DSV)
>gi|1213627 (X95939) type I interferon receptor [Ovis aries] (Match DSV)
>gi|1220217 (U49425) Lucilia cuprina beta esterase-related carboxylesterase
(Lc79) gene, partial cds [Lucilia cuprina] (Match DGV)

>gi|1225955|gnl|PID|e228613 (Z70177) homologous to yqbR of the skin element
[Bacillus subtilis] (Match DKV)
>gi|1235828 (U11972) emml gene product [Streptococcus pyogenes] (Match DEV)
>gi|1235842 (U11998) emml gene product [Streptococcus pyogenes] (Match DTV)
>gi|1236788 (L07418) polyprotein [Southampton virus] (Match DGV)
>gi|1244418 (U26382) VP7 [group A rotavirus] (Match DRV)
>gi|1246922|gnl|PID|e199301 (A27292) 21B4 [Babesia bovis] (Match DFV)
>gi|1254543|pat|US|5486595|8 Sequence 8 from patent US 5486595 (Match DTV)
>gi|1262126|gnl|PID|e235301 (Z70601) nonstructural protein 1 [Erythrovirus B19]
(Match DLV)

>gi|1293022 (U50250) ribulose-1,5-bisphosphate carboxylase/oxygenase large
subunit [Panax quinquefolius] (Match DVV)
>gi|1304009 (D84469) SLG12 [Brassica campestris] (Match DLV)
>gi|1350495 (L47606) ABA-responsive and embryogenesis-associated gene; lea-
like protein [Picea glauca] (Match DYV)

>gi|1360115|gnl|PID|e213272 (Z68147) glycoprotein B equivalent [Phocine
herpesvirus type 1] (Match DEV)

>gi|1352474|sp|P80507|IPYR_SYNY3 INORGANIC PYROPHOSPHATASE
(PYROPHOSPHATE PHOSPHO-HYDROLASE) (PPASE). (Match DRV)
>gi|1360894|pir||S54285 phosphoglycerate kinase - Thermotoga maritima (Match
DGV)

>gi|1399179 (U49426) 120 kDa immunodominant surface protein [Ehrlichia
chaffeensis] (Match DIV)

>gi|1399491 (U49666) Glp repressor [Pseudomonas aeruginosa] (Match DLV)
>gi|1435070|gnl|PID|e253922 (X99085) integrase [Ascobolus immersus] (Match
DYV)

>gi|1458198 (U63197) helicase [Hepatitis GB virus C] gi|1458200 (U63198)
helicase [Hepatitis GB virus C] gi|1458216 (U63206) helicase [Hepatitis GB virus
C] gi|1458218 (U63207) helicase [Hepatitis GB virus C] gi|1458222 (U63209)
helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458202 (U63199) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458204 (U63200) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458206 (U63201) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458208 (U63202) helicase [Hepatitis GB virus C] gi|1458220 (U63208)
helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458210 (U63203) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458212 (U63204) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458214 (U63205) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458224 (U63210) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1480344|gnl|PID|e254807 (X99405) glucose-6-phosphate dehydrogenase
[Nicotiana tabacum] (Match DLV)

>gi|1311403|pdb|1AUS|L Activated Unliganded Spinach Rubisco Mol_id: 1;
Molecule: Ribulose Bisphosphate CarboxylaseOXYGENASE; Chain: L, S;
Synonym: Rubisco; Ec: 4.1.1.39; Heterogen: Carbon Dioxide; Heterogen:
Magnesium (Match DTV)

>gi|1491736|gnl|PID|e223596 (X95287) archaeal ABC-transporter system
[Methanosarcina mazeii] (Match DAV)

>gi|1592296 (U67506) M. jannaschii predicted coding region MJ0568
[Methanococcus jannaschii] (Match DKV)

>gi|1518406|gnl|PID|e220405 (Z69198) ribulose-1,5-bisphosphate carboxylase,
large subunit [Triteleia bridgesii] (Match DLV)
>gi|1518698 (U61753) C3-3 [Oncorhynchus mykiss] (Match DVV)
>gi|1526499 (D87414) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526505 (D87417) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526525 (D87427) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526527 (D87428) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526529 (D87429) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526531 (D87430) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526533 (D87431) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1545998 (U60650) polyprotein [Drosophila x virus] (Match DIV)
>gi|1553002 (U65978) interferon alpha/beta receptor-1 [Ovis aries] (Match DSV)
>gi|1567698|gnl|PID|e254689 (A32883) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610446|pat|US|5523287|5 Sequence 5 from patent US 5523287
(Match DPV)
>gi|1567700|gnl|PID|e254629 (A32885) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610447|pat|US|5523287|7 Sequence 7 from patent US 5523287
(Match DPV)

>gi|1567702|gnl|PID|e254631 (A32887) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610448|pat|US|5523287|9 Sequence 9 from patent US 5523287
(Match DPV)

>gi|1567704|gnl|PID|e254632 (A32889) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610449|pat|US|5523287|11 Sequence 11 from patent US 5523287
(Match DPV)

>gi|1567706|gnl|PID|e254633 (A32891) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610450|pat|US|5523287|13 Sequence 13 from patent US 5523287
(Match DPV)

>gi|1567708|gnl|PID|e254634 (A32893) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610451|pat|US|5523287|15 Sequence 15 from patent US 5523287
(Match DPV)

>gi|1567710|gnl|PID|e254691 (A32895) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610452|pat|US|5523287|17 Sequence 17 from patent US 5523287
(Match DPV)

>gi|1575524 (U65005) structural phosphoprotein [Human herpesvirus 7] (Match
DVV)

>gi|1607344|pat|US|5500347|2 Sequence 2 from patent US 5500347 (Match DKV)
>gi|1607345|pat|US|5500347|3 Sequence 3 from patent US 5500347 (Match DKV)
>gi|1607346|pat|US|5500347|4 Sequence 4 from patent US 5500347 (Match DKV)
>gi|1607348|pat|US|5500347|6 Sequence 6 from patent US 5500347 (Match DKV)
>gi|1607349|pat|US|5500347|7 Sequence 7 from patent US 5500347 (Match DKV)
>gi|1608953|pat|US|5510461|9 Sequence 9 from patent US 5510461 (Match DHV)
>gi|1610343|pat|US|5521071|2 Sequence 2 from patent US 5521071 (Match DLV)
>gi|1610926|pat|US|5527773|3 Sequence 3 from patent US 5527773 (Match DKV)
>gi|1610980|pat|US|5527896|56 Sequence 56 from patent US 5527896 (Match
DMV)

>gi|1610981|pat|US|5527896|57 Sequence 57 from patent US 5527896 (Match
DIV)

>gi|1611666|pat|US|5539092|98 Sequence 98 from patent US 5539092 (Match
DKV)

>gi|1613384|pat|US|5559008|67 Sequence 67 from patent US 5559008 (Match
DDV)

>gi|1613387|pat|US|5559008|70 Sequence 70 from patent US 5559008 (Match
DDV)

>gi|1587874|prf||2207325A Antl gene [Aspergillus niger] (Match DEV)
>gi|1648974 (U66481) P1 structural protein [Hepatitis A virus] gi|1648990
(U66489) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648976 (U66482) P1 structural protein [Hepatitis A virus] gi|1648986
(U66487) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648978 (U66483) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648980 (U66484) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648982 (U66485) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648984 (U66486) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648988 (U66488) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1651445 (D90730) Hypothetical 29.8 KD protein in kdsB-kicB intergenic
region [Escherichia coli] (Match DKV)

>gi|1651926 (D90901) hypothetical protein [Synechocystis sp.] (Match DLV)
>gi|1651969 (D90901) hypothetical protein [Synechocystis sp.] (Match DDV)
>gi|1652043 (D90902) hypothetical protein [Synechocystis sp.] (Match DLV)
>gi|1653351 (D90913) HlyB family [Synechocystis sp.] (Match DDV)
>gi|1654110 (U14110) melatonin receptor Mel-1a [Phodopus sungorus] (Match
DSV)

>gi|1655822 (U59320) heat shock protein 60 [Leishmania major] (Match DEV)
>gi|1657485 (U73857) similar to E. coli o765 [Escherichia coli] (Match DVV)
>gi|1658269 (U74670) 120 kDa immunodominant surface protein [Ehrlichia
chaffeensis] (Match DIV)